Video Transcript of “Introduction to Big Data and Business Analytics” by Prof. Erik Paolo Capistrano
Good day, everyone! This is Erik Capistrano from UP Diliman, and I will be sharing with you some of the fundamentals of big data and business analytics. This course is divided into three parts: first, an introduction; second, some fundamentals of databases and data management; and third, the business issues and business challenges of big data.
So, what is business analytics? Business analytics comprises several tools and techniques. First off, we’ll look at business intelligence, which is all about combining reporting, data exploration, and ad hoc queries. Then there is analytics, the part where we go into the different modes of statistics, analysis, and interpretation of all this data.
Moving on, the funny thing about big data is that different companies define it differently. Let’s look at some of those definitions. For SAS, it covers the large volumes of structured and unstructured data that inundate a business on a day-to-day basis. IBM, however, says that it is generated everywhere around us at all times, arriving from multiple sources at an alarming velocity, volume, and variety. Gartner, meanwhile, looks at it as high-volume, high-velocity, and high-variety information assets that demand cost-effective and innovative methods for us to deliver good-quality data.
Moving on, we’ll also look at what research says about big data. Research describes big data in terms of terabytes, petabytes, and even exabytes of data. Take note: the world produces 2.5 quintillion bytes of data daily. That’s a huge amount of data we’re looking at right now.
All of this is now processed into reliable predictive and recommendation functions, based on a combination of processes, statistics, and descriptive and predictive data, to produce good results. Furthermore, we are now combining both proprietary customer data and public data to produce more accurate information for us to use.
Moving on, what defines big data? There are four aspects that we can see here. The first is volume, which is all about how huge the data sets are. The second is variety, which is about how many kinds of data we gather together: social media data, government data, financial data, banking data, and all sorts of transactions, all combined to build one or more profiles of your customers. The third is velocity, the speed of data. Every day, we see hundreds and thousands of pieces of Twitter, Facebook, and stock market data, along with all sorts of other information coming in, such as Uber data or weather data. All of these can be combined to produce good results. The last is veracity, which is one of the biggest challenges of big data. Veracity means that there is a lot of uncertainty: all of these different data are coming together, but the problem is that we don’t know what to do with them. They are subject to many different interpretations, shaped by factors such as society, culture, and even your own gut feeling.
Big data can also be defined in the following way. You can group the data into categories, such as different sorts of quantity, quality, and many other things. The point here is that, even now, we don’t really have a very good definition of what big data is and what it can do, which is why it is important for us to study it further.
So, what can it do? Some previous research talks about the combination of data, and what you see right now is a sample. Let’s say I go on a trip.
I go on a trip, I take a vacation, and I use my credit card to pay for the hotel, the airline, entrance tickets to theme parks, and so on. What happens is that the bank captures this data and, based on everything I paid for, processes all of those transactions and builds a profile of me. In other words, the bank is starting to judge me. How can they verify that this is really me? They can stalk my social media. Once I get to the destination country, I post pictures, I post my check-ins, I post my statuses, I tweet, I use Instagram, and if all of these are public, then the bank can verify that this is indeed me. In doing so, it can confirm that I’m the one who really went to this place and bought this stuff, and it can do a better job of profiling me. Now, what is the end result? They can make better recommendations the next time I use my credit card to travel or even shop. That’s basically how it works. You get public data, you get proprietary data, and then you get personal data from different sources, and then you combine them together. That is what big data is all about.
Here are some figures on the investments that large companies are making in big data. Productivity is said to be up by 5-6%, and business growth has reached at least 10%, which has created at least 4.4 million jobs over the last few years. Big data-driven systems also have effects on growth. You can see some of the big increases and improvements in value in manufacturing and retail. These are huge amounts; we’re talking about billions. On the other side, take credit card firms, for example. They were able to save money because they are now protected from fraud and credit risks thanks to big data analytics. Furthermore, the business data value is now at 25 billion dollars, and it will go up to 53.4 billion by 2017. The economic contribution of big data has also been calculated to reach 15 trillion dollars by 2030.
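Going back to the credit card example for a moment, the idea of combining proprietary data with public data can be sketched in a few lines of Python. This is only a minimal illustration, not how any actual bank does it: the record types, the customer ID, the country codes, the dates, the amounts, and the one-day matching window are all made-up assumptions. It keeps only the card transactions that are backed by a public check-in in the same country at around the same time, and then builds a simple spending profile from them.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CardTransaction:
    """Proprietary data: what the bank sees (hypothetical fields)."""
    customer: str
    country: str
    day: date
    category: str
    amount: float


@dataclass(frozen=True)
class PublicCheckIn:
    """Public data: a social media check-in (hypothetical fields)."""
    customer: str
    country: str
    day: date


def verified_transactions(transactions, check_ins, max_days_apart=1):
    """Keep transactions matched by a check-in from the same customer,
    in the same country, within max_days_apart days (an assumed rule)."""
    verified = []
    for tx in transactions:
        for ci in check_ins:
            same_person = tx.customer == ci.customer
            same_place = tx.country == ci.country
            close_in_time = abs((tx.day - ci.day).days) <= max_days_apart
            if same_person and same_place and close_in_time:
                verified.append(tx)
                break  # one matching check-in is enough
    return verified


def spending_profile(transactions):
    """Total spending per category, as a very simple customer profile."""
    profile = {}
    for tx in transactions:
        profile[tx.category] = profile.get(tx.category, 0.0) + tx.amount
    return profile


# Made-up sample data for one hypothetical customer.
transactions = [
    CardTransaction("customer_1", "JP", date(2016, 5, 1), "hotel", 300.0),
    CardTransaction("customer_1", "JP", date(2016, 5, 2), "theme park", 80.0),
    CardTransaction("customer_1", "BR", date(2016, 5, 2), "electronics", 950.0),
]
check_ins = [PublicCheckIn("customer_1", "JP", date(2016, 5, 1))]

verified = verified_transactions(transactions, check_ins)
profile = spending_profile(verified)
print(profile)
```

In this made-up sample, the two purchases in the country where the customer checked in are kept, while the purchase in a different country on the same day has no supporting check-in and is left out of the profile. That unmatched purchase is the kind of transaction the bank might want to look at more closely.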
Lastly, the software market for big data is now valued at 16 billion dollars, growing at a compound annual rate of 8%, and investments are also expected to increase to 76 billion by 2020.
Finally, who are the providers of big data? We’ve seen how big the numbers are, and we’ve also seen what is happening, so there is actually an entire chain of big data services. You can see now that there are different layers of providers and companies that are producing all of these services, products, software, and hardware for big data. This map gives you at least an impression of what is happening now in big data analytics.
To close out the session for now, we have looked at just a little bit of what data can do for us. In the next few sessions, we will look at some of the more fundamental aspects of how we can combine the data management services and the business practices that are both required for us to use big data properly. At the end of it all, we hope to understand how all of these work so that we can better apply and manage all of the things that big data can do for us.