Wednesday, December 06, 2023

ChatGPT one year on: who is using it, how and why?

 

Privacy First: A Better Way to Address Online Harms

EFF: “State, federal, and international regulators are increasingly concerned about the harms they believe the internet and new technology are causing. The list is long, implicating child safety, journalism, access to healthcare data, digital justice, competition, artificial intelligence, and government surveillance, just to name a few. 

The stories behind them are important: no one wants to live in a world where children are preyed upon, we lose access to news, or we face turbocharged discrimination or monopoly power. This concern about the impact of technology on our values is also not new—both serious concerns and outsized moral panics have accompanied many technological developments. The printing press, the automobile, the victrola, the television, and the VCR all prompted calls for new laws and regulations. 

Trouble is, our lawmakers seem to be losing the forest for the trees, promoting scattered and disconnected proposals addressing whichever perceived harm is causing the loudest public anxiety in any given moment. Too often, those proposals do not carefully consider the likely unintended consequencesor even whether the law will actually reduce the harms it’s supposed to target…

The truth is many of the ills of today’s internet have a single thing in common: they are built on a system of corporate surveillance. Multiple companies, large and small, collect data about where we go, what we do, what we read, who we communicate with, and so on. They use this data in multiple waysand, if it suits their business model, may sell it to anyone who wants it—including law enforcement. Addressing this shared reality will better promote human rights and civil liberties, while simultaneously holding space for free expression, creativity, and innovation than many of the issue-specific bills we’ve seen over the past decade. In other words, whatever online harms you want to alleviate, you can do it better, with a broader impact, if you do privacy first…”

 

ChatGPT one year on: who is using it, how and why?



Nature  “On 30 November 2022, the technology company OpenAI released ChatGPT — a chatbot built to respond to prompts in a human-like manner. It has taken the scientific community and the public by storm, attracting one million users in the first 5 days alone; that number now totals more than 180 million. Seven researchers told Nature how it has changed their approach.”

See also Tech Policy Press – New Study Suggests ChatGPT Vulnerability with Potential Privacy Implications -” What would happen if you asked OpenAI’s ChatGPT to repeat a word such as “poem” forever? A new preprint research paper reveals that this prompt could lead the chatbot to leak training data, including personally identifiable information and other material scraped from the web. The results, which have not been peer reviewed, raise questions about the safety and security of ChatGPT and other large language model (LLM) systems. 

 “This research would appear to confirm once again why the ‘publicly available information’ approach to web scraping and training data is incredibly reductive and outdated,” Justin Sherman, founder of Global Cyber Strategies, a research and advisory firm, told Tech Policy Press. The researchers – a team from Google DeepMind, the University of Washington, Cornell, Carnegie Mellon, University of California Berkeley, and ETH Zurich – explored the phenomenon of “extractable memorization,” which is when an adversary extracts training data by querying a machine learning model (in this case, asking ChatGPT to repeat the word “poem” forever”). 

With open source models that make their model weights and training data publicly available, training data extraction is easier. However, models like ChatGPT are “aligned” with human feedback, which is supposed to prevent the model from “regurgitating training data.”