What is Big Data?

A meme and a marketing term, for sure, but also shorthand for advancing trends in technology that open the door to a new approach to understanding the world and making decisions. There is a lot more data, all the time, growing at 50 percent a year, or more than doubling every two years, estimates IDC, a technology research firm. It’s not just more streams of data, but entirely new ones. For example, there are now countless digital sensors worldwide in industrial equipment, automobiles, electrical meters and shipping crates. They can measure and communicate location, movement, vibration, temperature, humidity, even chemical changes in the air.
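The growth claim above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the 50 percent rate stays constant and compounds once a year; the function names are illustrative and do not come from IDC.

```python
import math


def growth_factor(annual_rate: float, years: float) -> float:
    """Compound growth multiplier after `years` at a fixed annual rate."""
    return (1.0 + annual_rate) ** years


def doubling_time(annual_rate: float) -> float:
    """Years needed for the volume to double at a fixed annual rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


# Assumption: a constant 50 percent annual growth rate.
two_year_factor = growth_factor(0.50, 2)   # 1.5 ** 2 = 2.25
years_to_double = doubling_time(0.50)      # roughly 1.71 years

print(f"{two_year_factor:.2f}x after two years")
print(f"doubles in about {years_to_double:.2f} years")
```

At 50 percent a year the volume reaches 2.25 times its starting size after two years, which is why the estimate is phrased as "more than doubling".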

“Big Data are high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization” (Gartner 2012).

Complicated (intelligent) analysis can make even a small data set “appear” to be “Big”.


Ethics of Big Data

… important part of Big Data Ethics, but so too must the establishment of ethical principles and best practices that guide government agencies, corporate actors, data brokers, information professionals, and individual humans, whether we label them “Chief Privacy Officer,” “Civil Liberties Engineer,” “system administrator,” “employee,” or “user.” Individuals certainly share responsibility for ethical data usage and development, but the failure of the privacy self-management system shows that we must build structures that encourage ethical data usage rather than merely nudging individual consumers into sharing as much as possible for as little as possible in return. Big Data Ethics are as much a state of mind as a set of mandates. While engineers in particular must embrace the idea of Big Data Ethics, in an information society that cares about privacy, we must all be part of the conversation and part of the solution.[1]

Organizations should be thoughtful in their use of this technology, consulting widely and forming policies that record the decisions and conclusions they have reached. They should consider the wider implications of their activities, including:

  1. Context – For what purpose was the data originally surrendered? For what purpose is the data now being used? How far removed from the original context is its new use? Is this appropriate?
  2. Consent & Choice – What are the choices given to an affected party? Do they know they are making a choice? Do they really understand what they are agreeing to? Do they really have an opportunity to decline? What alternatives are offered?
  3. Reasonable – Is the depth and breadth of the data used and the relationships derived reasonable for the application it is used for?
  4. Substantiated – Are the sources of data used appropriate, authoritative, complete and timely for the application?
  5. Owned – Who owns the resulting insight? What are their responsibilities towards it in terms of its protection and the obligation to act?
  6. Fair – How equitable are the results of the application to all parties? Is everyone properly compensated?
  7. Considered – What are the consequences of the data collection and analysis?
  8. Access – What access to data is given to the data subject?
  9. Accountable – How are mistakes and unintended consequences detected and repaired? Can the interested parties check the results that affect them?

Together these facets are called the ethical awareness framework. This framework was developed by the UK and Ireland Technical Consultancy Group (TCG) to help people to develop ethical policies for their use of analytics and big data. [2]
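One way to put the nine facets to work is to keep a written answer for each one and flag the facets a project has not yet addressed. The sketch below is only an illustration under that assumption. The `EthicsReview` class, its methods, and the example project are hypothetical and are not part of the framework as published.

```python
from dataclasses import dataclass, field

# Facet names are taken from the list above; everything else is hypothetical.
FACETS = (
    "Context", "Consent & Choice", "Reasonable", "Substantiated", "Owned",
    "Fair", "Considered", "Access", "Accountable",
)


@dataclass
class EthicsReview:
    """Records a project's written answer for each facet of the framework."""
    project: str
    answers: dict = field(default_factory=dict)

    def record(self, facet: str, answer: str) -> None:
        """Store the answer for one facet, rejecting names outside the framework."""
        if facet not in FACETS:
            raise ValueError(f"unknown facet: {facet}")
        self.answers[facet] = answer.strip()

    def open_facets(self) -> list:
        """Facets that still have no non-empty answer, in framework order."""
        return [f for f in FACETS if not self.answers.get(f)]


# Hypothetical project and answers, for illustration only.
review = EthicsReview(project="smart-meter analytics")
review.record("Context", "Readings were collected for billing; now used for load forecasting.")
review.record("Access", "Customers can download their own readings.")
print(review.open_facets())
```

A review like this does not decide whether a use of data is ethical. It only makes visible which questions have not been answered yet.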

[1]. Richards, Neil M., and Jonathan H. King. 'Big Data Ethics'. 2014. Print.

[2]. Chessell, Mandy. 'Ethics for Big Data and Analytics'. IBM (2014): 1. Print.

Extra work for Coursera diploma:

Extra work for one missed lecture:

Exam Questions:

1. “Big data not only refers to very large datasets and the tools and procedures used to manipulate and analyze them, but also to a computational turn in thought and research” (Burkholder, 1992). How do you think big data is related to a computational turn in thought and research? Reason: During this course I have read carefully about big data, and I understand that the big deal with big data is that we face very large datasets that we must analyze and manipulate, which of course helps us in business. When I saw this sentence, which cites Burkholder, I understood that there are many more aspects to consider when using big data. I think this question challenges students to think more deeply about big data.

2. Can we say that “big data is better data”? Why? Reason: “Social scientists have long argued that what makes their work rigorous is rooted in their systematic approach to data collection and analysis” (McClosky, 1985). This challenges students to think more carefully about how much of the available big data we can actually use, how much we do not use, and why that happens.

3. What is the difference between open data and open source? Reason: I think these two concepts are very similar to each other, and distinguishing them requires a good understanding of open data.

4. In his book on open data, Joel Gurin wrote that “one of the benefits of data control is privacy”. Do you agree? Why or why not? Reason: I think privacy is the important thing that we have lost with open data, but Joel Gurin saw a positive aspect. Is there really a positive aspect for privacy in open data?