Monday, December 16, 2019

How to Use a Data-Scraping Tool to Extract Data from Webpages


Apple CEO Tim Cook says monopolies aren’t bad if they aren’t abused Business 

France Wants To Rein In Big Tech

Digital taxes are only a start, says France’s digital affairs minister. “More important? Targeting the biggest tech companies—most of which are American—with new regulations to prevent them stifling competition and damaging democracy.” (Er, and part of the plan is designed to help French entrepreneurs and their start-ups.) – Wired

Australian commando who raised Afghanistan war crime allegations found dead


Sergeant Kevin Frost, a "fearless" Australian commando who went public three years ago about his involvement in an alleged war crime in Afghanistan, has died, becoming one of the hundreds of former and serving defence personnel who have taken their own lives in the past two decades.


GOVERNMENT ACCOUNTABILITY: Two former ACCC executives who became whistleblowers against the ACCC two years ago talked to Crikey Inq journalist David Hardaker about why they did it and what goes on inside the mind of a government whistleblower.

  

US ambassador claims Chinese agents at work on Australian streets

As an Australian student turns to the US for support against alleged Chinese government harassment, the US ambassador has called on the Morrison government to take stronger action.



The Fact-Check Industry - Columbia Journalism Review – Has our investment in debunking worked? “…Outside newsrooms, money is pouring in to set up new types of organizations to combat misinformation. There is now a sector of fact-checking philanthropy, fueled by Google, Facebook, and nonprofit foundations. As a result, the Duke count noted, last year forty-one out of forty-seven fact-checking organizations were part of, or affiliated with, a media company; this year, the figure is thirty-nine out of sixty. In other words, the number of fact-checking organizations is growing, but their association with traditional journalism outlets is weakening…”


Another senior Gov.UK bod makes a dash from public sector, falls into AWS's arms

AWS tech academy, aka Her Maj's Government, trains next gen of Amazonians



How to Use a Data-Scraping Tool to Extract Data from Webpages - maketecheasier – “If you’re copying and pasting things off webpages and manually putting them in spreadsheets, you either don’t know what data scraping (or web scraping) is, or you do know what it is but aren’t really keen on the idea of learning how to code just to save yourself a few hours of clicking. Either way, there are a lot of no-code data-scraping tools that can help you out, and Data Miner’s Chrome extension is one of the more intuitive options. If you’re lucky, the task you’re trying to do will already be included in the tool’s recipe book, and you won’t even have to go through the point-and-click steps involved in building your own…” 
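For readers who do not mind a little code, the copy-and-paste chore described above can be roughly sketched in Python using only the standard library. This is a minimal illustration, not how Data Miner or any other tool actually works: the names `TableScraper`, `table_to_csv` and `SAMPLE_HTML` are made up for this example, and the sample table is invented data standing in for a real webpage.

```python
import csv
import io
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Hypothetical helper: collects the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row being built, or None when outside a <tr>
        self._cell = None   # text pieces of the current cell, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return the table cells found in `html` as CSV text, ready for a spreadsheet."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Invented sample page fragment (an assumption for illustration only).
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Widget</td><td>4.50</td></tr>
  <tr><td>Gadget, large</td><td>12.00</td></tr>
</table>
"""

if __name__ == "__main__":
    print(table_to_csv(SAMPLE_HTML))
```

The CSV text it returns can be saved to a file and opened in any spreadsheet program. Real pages are usually messier than this sample (nested tables, content loaded by scripts), which is part of why point-and-click tools are appealing.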


  • For many other resources and tools to extract data from web pages – please see 2020 Guide to Web Data Extractors – This guide by Marcus P. Zillman is a comprehensive listing of web data extractors, screen, web scraping and crawling sources and sites for the Internet and the Deep Web. These sources are useful for professionals who focus on competitive intelligence, business intelligence and analysis, knowledge management and research that requires collecting, reviewing, monitoring and tracking data, metadata and text.

The Guardian UK – “…On [December 2, 2019], the Electronic Frontier Foundation published a 17,000-word report on this topic. Behind the One-Way Mirror: A Deep Dive Into the Technology of Corporate Surveillance, by Bennett Cyphers and Gennie Gebhart, covers both online privacy problems and the growth of real-world surveillance. BOWM, for short, explains how personal data is gathered, brokered, and used to serve targeted advertisements. In theory, users should prefer useful adverts to irrelevant ones. In reality, it provides a stream of data to anyone who wants it. Most of us, I suspect, don’t object to the ads as much as to the vast infrastructure used to deliver them. Non-targeted ads are fine with me. As the report points out, when you visit a website, data associated with your online identity will be sent to anyone interested in bidding in an auction to show you a targeted advertisement. A data-snorting company can just make low bids to ensure it never wins while pocketing your data for nothing. This is a flaw in the implied deal where you trade data for benefits. You can limit what you give away by blocking tracking cookies. Unfortunately, you can still be tracked by other techniques. These include web beacons, browser fingerprinting and behavioural data such as mouse movements, pauses and clicks, or sweeps and taps. Data brokers can try to connect whatever information they get to data that you are giving away in other areas. This might include your email address, mobile phone number, location, credit card and store card numbers, your car’s number plate and face recognition data. Some of this information may have been purchased from third parties...
As BOWM points out, real-world identifiers can last a lot longer than your browsers or even your devices. Your main email address, phone number, credit card number and car number plate don’t change very often. Good luck changing, or disguising, your fingerprint and face recognition data. “Gait recognition” is already being used in China. You can run but you can’t hide. Today, we are past the stage where it’s a technology problem. Only governments can protect our privacy by banning the collection of data and giving us the rights both to prevent its collection without explicit permission, and to delete data that has already been collected…”
       
Jeff Bezos warns US military it risks losing tech supremacy FT. Bezos figures out where the real money is.