Wednesday, July 19, 2023

Adam Powick, Deloitte’s chocolate teapot


Gov’t “Advisory Panel” of PwC tax partners axed


How US Department of Homeland Security Became Global Thought Police Kit Klarenberg, Kit’s Newsletter



Why new facial-recognition airport screenings are raising concerns CU Boulder Today


How an “AI-tocracy” emerges MIT News. Focused on China, but findings would seem to apply elsewhere as well.


Book Review: Richard Vague, “The Paradox of Debt”

And why finding reforms to our current economic system is a Sisyphean task. 


When Artificial Intelligence Becomes a Central Banker

Artificial intelligence is expected to be widely used by central banks, as it brings considerable cost-saving and efficiency benefits.


They Don’t Want Us And We Don’t Need Them Defector


Bees just wanna have fungi: a review of bee associations with non-pathogenic fungi Microbiology Ecology


Genes for learning and memory are 650 million years old, study shows (press release) University of Leicester. Somewhat less readable original


Drexel’s Second Coming Marc Rubinstein, Net Interest


Crawlers, search engines and the sleaze of generative AI companies Search Engine Land

Search Engine Land: “…LLMs are not search engines. It should now be very clear that an LLM is a different beast from a search engine. A language model’s response does not directly point back to the website(s) whose content was used to train the model. There is no economic exchange like we see with search engines, and this is why many publishers (and authors) are upset. The lack of direct source citations is the fundamental difference between a search engine and an LLM, and it is the answer to the very common question of “why should Google and Bing be allowed to scrape content but not OpenAI?” (I’m using a more polite phrasing of this question.) Google and Bing are trying to show source links in their generative AI responses, but these sources, if shown at all, are not the complete set.

This opens up a related question: Why should a website allow its content to be used to train a language model if it doesn’t get anything in return? That’s a very good question – and probably the most important one we should answer as a society. LLMs do have benefits despite the major shortcomings of the current generation (such as hallucinations, lying to the human operators, and biases, to name a few), and these benefits will only increase over time as the shortcomings get worked out. But for this discussion, the important point is to realize that a fundamental pillar of how the open web functions right now is not suited for LLMs…”
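The only lever the article leaves publishers with today is the old crawler convention: robots.txt. As a sketch of how a site might admit search-engine crawlers while refusing a training-data crawler, here is a minimal example using Python’s standard-library robots.txt parser. The robots.txt content is hypothetical; CCBot is Common Crawl’s documented user-agent token (Common Crawl corpora are widely used for LLM training), and Googlebot is Google Search’s. Note this only works if the crawler voluntarily honors robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: search crawlers may index the site,
# but Common Crawl's bot (a common source of LLM training data) may not.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: CCBot
Disallow: /
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# The parser matches rules by user-agent token.
print(rp.can_fetch("Googlebot", "https://example.com/article"))  # True
print(rp.can_fetch("CCBot", "https://example.com/article"))      # False
```

The asymmetry the article describes lives entirely in that file: the same mechanism that invites a crawler offering an economic exchange (search referrals) can exclude one offering none, but nothing in the protocol enforces it.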