Wednesday, June 15, 2016

Big Data: Set Your Data Free and Reap the Rewards

Coder fired after 6 years for automating his job Boing Boing 

Infographic: The ultimate guide to SEO-friendly URLs

Bipartisan Policy Center – Healthy Aging Begins at Home, May 22, 2016

“…is an official government website designed to accelerate the use of crowdsourcing and citizen science across the U.S. government. The site provides a portal to three key assets for federal practitioners: a searchable catalog of federally supported citizen science projects, a toolkit to assist with designing and maintaining projects, and a gateway to a federal community of practice to share best practices.”

The knowledge economy is a myth. We don’t need more universities to feed it. “The idea of the knowledge economy is appealing. The only problem is it is largely a myth. Developed western economies such as the UK and the US are not brimming with jobs that require degree-level qualifications. For every job as a skilled computer programmer, there are three jobs flipping burgers. The fastest-growing jobs are low-skilled repetitive ones in the service sector. One-third of the US labour market is made up of three types of work: office and administrative support, sales and food preparation.”

“…Make data available. So what do you do? Many disciplines, universities, and federal agencies have started to build repositories, slowly filling caverns of data to mine. The best ones allow for easy uploading and a pathway to making these observations machine-readable, with provenance information and metadata inseparable from the pudding that is your data. Well-known repositories include DRYAD, figshare, KNB, DOE Oak Ridge National Laboratory Distributed Active Archive Center for Biogeochemical Dynamics, iPlant, and DataOne. Some scientists are also resorting to places like GitHub, originally built for software code development, but which is now also a decent home for data, figures, and metadata, even labeled with hashtags (see #openexperiment, for example). Some disciplines have created their own metadata formats and units, like Ecological Metadata Language (EML) in Biogeosciences, or Climate and Forecast convention NetCDF in Atmospheric Sciences. Read up on these. Be intrepid and share. Find your repository. Learn about licenses for sharing like Creative Commons. And then you’ll be ready the next time an editor remarks, like in those old Wendy’s ads, “Nice manuscript, but where’s the data?” —Ankur R. Desai, Editor, Journal of Geophysical Research: Biogeosciences; email: [thanks to Darlene]

“The truth is what happens. The world is only what it is.” So says Maria, a character in Mr. Eternity. But understanding the truth, understanding what happens (or what happened in the past, or what will happen in the future) is the problem. History cycles -- nations die and rebuild, slavery comes and goes -- but what we learn from the past is even less than what we learn from the present, hampering our ability to build a better future. That’s one of many lessons that might be taken from Mr. Eternity, an odd, sprawling, very funny account of a very long life. Mr. Eternity by Aaron Thier

The sharing economy and on-demand services are weaving their way into the lives of many in the community, raising difficult issues around jobs, regulation and the potential emergence of a new digital divide.
Shared, collaborative and on demand: the new digital economy

LeakedSource is a search-engine capable of searching over 1.8 billion leaked records — an aggregation of data from hundreds of disparate sources. We have been able to accumulate this data over a relatively short period of time through a combination of deep-web scavenging and rumor-chasing. Occasionally these efforts lead to major discoveries…If we come across a leaked database from a company that most people haven’t heard of, we will incorporate it into our master database just the same. You may search for yourself in the leaked credentials by visiting our homepage. If your personal information  appears in our copy of the Twitter credentials, or in any other leaked database that we possess, you may remove yourself for free…”

Behavioral biometrics is used in Sweden, Denmark and Norway, and it’s integrated into a system called BankID, which major banks use to identify their customers. In Sweden, the system has 6.5 million active users. In Norway, it’s used by over 75 percent of the adult population. Banking customers use it for everyday transactions, from logging in to bank accounts to filing taxes.
Google Plans to Kill Passwords With This Tech, but Scandinavia Is Way Ahead of It
Denmark’s new e-Government strategy aims to spur the digitisation of public administration. The e-Government Strategy 2011-2015 focusses on making printed forms or letters redundant, creating new welfare services and spurring the sharing and reuse of available public sector solutions and data.
Denmark to accelerate government digitisation

Tax implications for the sharing economy
If you have clients earning income from the sharing economy, they may not know what their tax obligations are.

They may hate him, but with the perspective of three years, it is clear Edward Snowden actually helped the legal intelligence community. [Lawfare]

What can be done to tackle the “walking dead” 

Jack Townsend, Seizure of Hard Drive Computer Data by Mirroring and the Fourth Amendment. “Although the issue arose in a tax evasion case, it is an issue that pervades the criminal law given that computers are ubiquitous and can be a mother lode in a criminal investigation.”
Panama Papers Show How Rich United States Clients Hid Millions Abroad New York Times.

Leandra Lederman (Indiana-Bloomington) presented Does Enforcement Crowd Out Voluntary Tax Compliance? at Oxford University's Saïd Business School and Vienna University of Economics and Business. Governments commonly use deterrence methods, such as audits and the imposition of penalties, to foster compliance with tax laws. Although this approach is consistent with economic modeling of tax compliance, some scholars caution that deterrence may backfire, “crowding out” intrinsic motivations to pay taxes and thus reducing compliance. This article analyzes the evidence to date to determine the extent of such an effect. Field studies suggest that deterrence tools, such as audits, generally are highly effective at increasing tax collections but that crowding out may occur in some contexts, with respect to certain subgroups of taxpayers.

Logi Analytics, the leader in self-service analytics, today released findings from its fourth annual State of Embedded Analytics Report, [reg req’d] which found that the adoption rate of embedded analytics among business users is twice that of traditional BI tools. The report, which studied how organizations embed business intelligence and analytics inside their software applications, revealed that embedded analytics continues to improve user satisfaction and increase end-user adoption of analytic tools. “The report shows that demand for self-service analytics is expanding beyond data analysts to everyday users, who need to monitor and measure key performance indicators,” said Brian Brinkmann, vice president, Logi Analytics. “If organizations want to see these users be successful, they need to offer analytics within the business applications they are using every day.” When users are forced to leave their preferred business applications to conduct analysis, they are less likely to use that analytics tool. The report found that 43 percent of application users leverage embedded analytics regularly, which is double the user adoption rate of traditional BI tools reported in the 2015 State of Self-Service Report. Application providers say they expect the adoption rate of embedded analytics to increase to 52 percent within two years…”

Inside an Amazon Warehouse, the Relentless Need to “Make Rate” Gawker

 “The Berkman Center for Internet & Society is delighted to announce the launch of the Net Data Directory, a free, publicly available, searchable database of different sources of data about the Internet. The directory is intended to make finding useful quantitative data about a broad range of Internet-related topics—broadband, cybersecurity, freedom of expression, and more—easier for researchers, policymakers, journalists, and the public.”

Via BBC – Tom Chatfield 5 June 2016 – “You may be familiar with the statistic that 90% of the world’s data was created in the last few years. It’s true. One of the first mentions of this particular formulation I can find dates back to May 2013, but the trend remains remarkably constant. Indeed, every two years for about the last three decades the amount of data in the world has increased by about 10 times – a rate that puts even Moore’s law of doubling processor power to shame…Here’s the problem with much of the big data currently being gathered and analysed. The moment you start looking backwards to seek the longer view, you have far too much of the recent stuff and far too little of the old. Short-sightedness is built into the structure, in the form of an overwhelming tendency to over-estimate short-term trends at the expense of history…”
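
The two figures in the Chatfield quote fit together arithmetically: if the total amount of data multiplies by about 10 every two years, then roughly 90% of it dates from the most recent two years. Here is a minimal sketch of that calculation, assuming steady multiplicative growth; the function name recent_share is only illustrative and does not come from the article.

```python
def recent_share(growth_factor: float) -> float:
    """Fraction of today's total that was created in the latest period,
    assuming the total multiplies by growth_factor every period."""
    # If the total is T now, it was T / growth_factor one period ago,
    # so the share added since then is 1 - 1 / growth_factor.
    return 1 - 1 / growth_factor


# Assumption from the quote: data grows about 10x every two years.
print(f"Share created in the last two years at 10x growth: {recent_share(10):.0%}")

# For comparison, a doubling every two years (the Moore's law rate cited in the quote).
print(f"Share created in the last two years at 2x growth: {recent_share(2):.0%}")
```

Under the same assumption, a mere doubling every two years would leave only half of all data in the most recent period, which is why the tenfold rate skews the record so heavily toward the present.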

Vox data scientist interactive map of mass shootings since Sandy Hook