Category Archives: Big Data

2017 Conference dates chosen!


We’re excited to announce that the dates for the 2017 4T Data Literacy Virtual Conference have been set! We’ll meet virtually on July 20-21, 2017. This year, we’re focusing our presentations on three (and a half) themes:

  1. Big Data, including citizen science
  2. Ethical data use
  3. Personal data management

Registration and more details are coming soon. If you registered last year, you’re already on our list, and we will let you know when it’s time to sign up!

Reading Recommendation: Big Data

The rise of big data can be traced back through history. Viktor Mayer-Schönberger and Kenneth Cukier chronicle its evolution and describe its current state in Big Data: A Revolution that Will Transform How We Live, Work, and Think. I couldn’t put it down!

One defining aspect of big data is its focus on “what” data say. In other words, big data reveals trends and patterns, but it does not explain why they appear or occur. Mayer-Schönberger and Cukier make this observation about correlation and causation:

[i]n a big data world…we won’t have to be fixated on causality; instead we can discover patterns and correlations in the data that offer us novel and invaluable insights. The correlations may not tell us precisely why something is happening, but they alert us that it is happening.

How does this point shape your understanding of big data and its impact?

 

Source: Mayer-Schönberger, Viktor, and Kenneth Cukier. Big Data: A Revolution that Will Transform How We Live, Work, and Think. Boston, MA: Houghton Mifflin Harcourt, 2013.

Image: “Sunrise Sky Blue Sunlight Clouds Dawn Horizon” by PublicDomainPictures, on Pixabay. CC0 Public Domain.

Reading Recommendation: Predictive Analytics

When used to make predictions, data can be quite powerful! A common example is the story of the retailer Target’s prediction of a customer’s pregnancy. When the company sent coupons for baby products to a teen, her father complained. However, it turned out that she was indeed pregnant. Such stories can be both impressive and concerning. In addition to revealing trends and patterns, data can lead to new information. In the case of Target and the teen, the store did not just know what the teen bought. Those data suggested more information: her pregnancy. As Eric Siegel writes:

[t]his isn’t a case of mishandling, leaking, or stealing data. Rather, it is the generation of new data, the indirect discovery of unvolunteered truths about people. Organizations predict these powerful insights from existing innocuous data, as if creating them out of thin air.

To help readers understand how predictive analytics works, Siegel provides a wealth of examples and in-depth explanations in Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die. Understanding how organizations glean information from data and use that information helps us understand marketing and decision-making today. It also helps us manage our personal data.

 

Source: Siegel, Eric. Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die. Hoboken, New Jersey: John Wiley & Sons, 2013.

Image: “Women Grocery Shopping.jpg” by Bill Branson (Photographer), on Wikimedia Commons. Public Domain. 

Need data? Try Data.gov

For data about a wide variety of topics, from education to environment, Data.gov is a great source. This portal for data gathered by the U.S. government offers downloadable files that you and your students can analyze. It’s a good place to get your feet wet working with spreadsheets and data to spot patterns, form arguments, and create visualizations!

You can find examples (under the “Data” tab) to use with your students, and students can become familiar with finding and manipulating data by exploring this website and selecting data sets. Also, Data.gov demonstrates government transparency and open access to data.

Tip: Look for CSV or .xlsx files, which are easy to download and view in spreadsheet software like Excel and Google Sheets.
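If you or your students would like to try a few lines of code instead of a spreadsheet, here is a minimal Python sketch of the same kind of pattern-spotting. Everything in it is made up for illustration: the sample rows, the column names (state, year, enrollment), and the helper function average_by_column do not come from any actual Data.gov file. With a real download, you would read the file's contents in place of the sample text and use that file's own column names.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for a CSV file downloaded from Data.gov.
# The column names and numbers are invented for this illustration.
SAMPLE_CSV = """state,year,enrollment
Michigan,2014,1500
Michigan,2015,1550
Ohio,2014,1700
Ohio,2015,1650
"""


def average_by_column(csv_text, group_col, value_col):
    """Return the mean of value_col for each distinct value of group_col."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    # DictReader uses the first row of the CSV as the column names.
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[group_col]] += float(row[value_col])
        counts[row[group_col]] += 1
    return {key: totals[key] / counts[key] for key in totals}


averages = average_by_column(SAMPLE_CSV, "state", "enrollment")
for state, avg in sorted(averages.items()):
    print(f"{state}: {avg:.1f}")
```

Grouping rows and averaging a column is only one simple way to look for a pattern; a pivot table in Excel or Google Sheets does the same job without any code.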

 

Image: Screenshot of Data.gov homepage.

Reading Recommendation: Data and Goliath

Where are your data stored, and who has control of your data?

The answer to this question is not always straightforward. We don’t always know whose eyes are on our data. For example, cell phone data reside on servers of private companies. A lot of information can be gleaned from data, from your location to your relationships.

Bruce Schneier writes about surveillance via data in Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World. For anyone curious about what data companies and the government keep and monitor, it is a fascinating read.

One of Schneier’s points concerns the relationship between security and privacy as they pertain to data. Access to data, like cell phone logs, can reduce privacy but support security. He writes:

[o]ften the debate is characterized as “security versus privacy.” This simplistic view requires us to make some kind of fundamental trade-off between the two: in order to become secure, we must sacrifice our privacy and subject ourselves to surveillance. And if we want some level of privacy, we must recognize that we must sacrifice some security in order to get it.

However, this contrast between security and privacy might not be necessary. Schneier goes on to point out that:

[i]t’s a false trade-off. First, some security measures require people to give up privacy, but others don’t impinge on privacy at all: door locks, tall fences, guards, reinforced cockpit doors on airplanes. When we have no privacy, we feel exposed and vulnerable; we feel less secure. Similarly, if our personal spaces and records are not secure, we have less privacy. The Fourth Amendment of the US Constitution talks about ‘the right of the people to be secure in their persons, houses, papers, and effects’… . Its authors recognized that privacy is fundamental to the security of the individual.

More generally, our goal shouldn’t be to find an acceptable trade-off between security and privacy, because we can and should maintain both together.

Schneier’s book is illuminating for considering personal data management (one of the themes for the upcoming second year of our project in 2016-2017!) in light of data use by commercial companies and government. Schneier takes a philosophical approach to discussing data, security, and privacy. He concludes with useful tips for protecting your data. Read Data and Goliath for some great food for thought!


Source: Schneier, Bruce. Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World. New York: W.W. Norton & Company, 2015.

Image: “People Lens White Eye Large” by skitterphoto.com, on Pexels. CC0 Public Domain. 

Recognizing the need for data literacy

Awareness is growing that students need instruction in working with data, which our project is helping librarians provide. Given the prevalence of data, technology, the Internet, and digital resources, data literacy is a competency that equips students to navigate information. Education Week recently highlighted this need, including the skills to use data in arguments and to understand data privacy.

Reading recommendation: The Internet of Us

With summer fast approaching, here’s a book suggestion!

I just finished The Internet of Us: Knowing More and Understanding Less in the Age of Big Data. It offers an interesting commentary on how people interact with information and big data.

Author Michael Patrick Lynch takes a philosophical approach to issues in the information age. He writes about the difference between knowing and understanding. Have you ever been concerned about big data’s focus on the “what” rather than the “why,” and about the claim that the “what” is sometimes enough for understanding trends? Lynch recognizes this concern. He points out issues with the practice of considering only what is happening and looking only at correlations.

Lynch asserts that three aspects compose big data:

  1. the volume of data,
  2. analysis of that data,
  3. and uses of that data by big companies.

He also discusses the dangers of decreased privacy owing to the creation of data through our activities and the use of it by companies.

Data analysis is impossible without context, according to Lynch. This point feeds his conclusion that knowing how parts connect with the whole is key to being a responsible “knower.” People need to see how information that they find online fits with their broader knowledge and the world. Seeing this bigger picture allows them to be creative. As he writes:

…our digital form of life tends to put more stock in some kinds of knowing than others. Google-knowing has become so fast, easy and productive that it tends to swamp the value of other ways of knowing like understanding. And that leads to our subtly devaluing these other ways of knowing without our even noticing that we are doing so–which in turn can mean we lose motivation to know in these ways, to think that the data just speaks for itself. And that’s a problem–in the same way that our love affair with the automobile can be a problem. It leads us to overvalue one way to get to where we want to go, and as a result we lose sight of the fact that we can reach our destinations in other ways–ways that have significant value all their own. (p. 179-80)

The Internet of Us shows both the pros and cons of technology and big data. It is not an anti-technology book. Instead, Lynch raises awareness of modern practices. Lynch’s distinction between knowing by searching online and actually developing skills is something that we’d all do well to remember. For those of you who are looking for inspiration, and for points to make when students wonder why they have to learn something when they can just find the information online, this book is for you!

 

Source: Lynch, Michael Patrick. The Internet of Us: Knowing More and Understanding Less in the Age of Big Data. New York: Liveright Publishing Corporation, 2016.

Image: “Photo of Holloways Beach, QLD, Australia” by Alexander Khimushin, on Wikipedia. CC BY-SA 3.0.

Team member Debbie Abilock on online youth privacy and big data

This post is Part 2 in a two-part series highlighting our team members’ work with Choose Privacy Week. This initiative of the American Library Association puts a spotlight on issues of privacy in today’s digital world, such as tracking in online searches. Knowing how your data are used is a component of data literacy, and we are excited to feature our team members’ blog posts on these topics.

Debbie wrote with Rigele Abilock about online privacy policies and data collection on the Choose Privacy Week blog. Data collection by vendors can affect students, as they explain:

Reconciling big data opportunities with healthcare privacy concerns is the same dilemma we face in education. Instructors want to support personalized learning, instruction, and classroom management with online offerings – but the data of underage patrons hangs in the balance. Just as health profiling based on longitudinal data collection raises red flags, so does educational performance profiling. Ethically and practically, youth will always be our Achilles Heel.

What data vendors are collecting can be difficult to discern. Debbie and Rigele advise a close examination of each vendor’s Terms of Service and Privacy Policy:

The Privacy Policy is an on-the-ground description of how the vendor operates its site, and should be read in conjunction with the Terms of Service.  A link to the Privacy Policy must be placed on the vendor’s homepage and/or product page.  The Privacy Policy is a working picture of the company’s current and expected practices related to data use, collection, and sharing, as well as marketing, advertising, access, and security control. While a Privacy Policy lacks the contractual element of a click-through signature, it remains the primary declaration of the company’s privacy practices, and thus may be enforceable against a vendor that breaches those standard practices. Through a close reading of the Privacy Policy, you should be able to learn a great deal about a vendor’s privacy standards; if the language is overly complex or contorted, treat that as indicative of what a vendor may want you to know, or not.

And so we come to intention. Close reading of a Terms of Service and Privacy Policy must be augmented by your common-sense evaluation of a vendor’s corporate intention. Both for-profit and non-profit entities may choose to embed trackers into Web pages to collect information such as navigation patterns and preferences. Certain trackers, such as Facebook’s “like” thumb and Twitter’s blue bird, are visible, but most are hidden.  Sometimes these trackers follow the user to other sites to gain additional insight, in order to create a better user experience. Specifically, trackers may run tests on differences in language and image use, look for ways to improve navigation, and fix technical problems.

Check out their post for some practical tips on monitoring what information vendors collect!

Image: “Freedom from Surveillance — Choose Privacy Week 2012,” American Library Association, on Choose Privacy Week

In the News: Internet privacy

In what ways do you limit your data sharing? Do you join or avoid loyalty rewards programs that track your habits? Do you block or regularly clean out cookies in your browser? These are some of the areas where you have control over your information. Yet, data sharing with third parties is sometimes out of your hands or buried in the fine print of services that you use.

This past week, the Federal Communications Commission proposed new rules to give you a choice to opt out of data sharing with third parties by your Internet service provider. While these rules do not apply to sharing by websites, as critics point out, they do take a step toward consumer control of data sharing in the United States. It will be interesting to see what comes of this possibility!

Image: “Binary Map Internet Technology World Digital” by Pete Linforth on Pixabay. CC0 Public Domain. https://pixabay.com/en/binary-map-internet-technology-1012756/