Facebook also allows you to download your personal activity data. Welcome to the data repository for the SQL Databases course by Kirill Eremenko and Ilya Eremenko. UCI is a great first stop when looking for interesting data sets. Gapminder: hundreds of datasets on world health, economics, population, etc. Some examples of this include data on tweets from Twitter, and stock price data. We've collected articles including wacky and useful data sets for training machine learning models, practicing an analytical language, or finding compelling insights. The Statistics department at NCSU has electronically posted the datasets from this book here. You can browse World Bank data sets directly, without registering. Sage Research Methods Datasets, Data Planet, and Linguistics Data Consortium corpora are only available to NC State faculty, students, and staff. Some examples of small data are the scores of formative assessments, students’ confidence levels when answering a question, the time it takes to complete an assignment, etc. FiveThirtyEight makes the data sets used in its articles available online on GitHub. Anyone can download the data, although some data sets require additional hoops to be jumped through, like agreeing to licensing agreements. The data set can be used to demonstrate paired t-tests, repeated measures ANOVA and a mixed between-within ANOVA using the final variable ‘Margarine’. Data.gov is a relatively new site that’s part of a US effort towards open government. Fish Market Dataset for Regression. The categories listed below will link you to a useful bank of large data sets for experimentation with Minitab (.mtp files), TI-83/TI-83Plus (.txt files), and Excel (.xls files). BuzzFeed makes the data sets used in its articles available on GitHub. All datasets consist of tabular data with no (explicitly) missing values. You’ll need to sign up for a GCP account, but the first 1TB of queries you make is free.
FiveThirtyEight. Some may be data that’s recorded from human observations. Some of this information is free, but many data sets require purchase. But for something truly unique, what about analyzing your own personal data? You can download data for either, but you have to sign up for Kaggle and accept the terms of service for the competition. All other resources are public. REGRESSION is a dataset directory which contains test data for linear regression. Don’t jump right into the analysis; take the time to first understand the data you are working with. The data set shouldn’t have too many rows or columns, so it’s easy to work with. It shouldn’t be messy, because you don’t want to spend a lot of time cleaning data. Note: the TI-83/TI-83Plus files are saved in ASCII format and may be loaded into any other software that reads ASCII. It may sometimes turn out that the data set you’re analyzing isn’t really suitable for what you’re trying to do, and you’ll need to start over. They also have SDKs for R and Python to make it easier to acquire and work with data in your tool of choice (you might be interested in reading our tutorial on the data.world Python SDK). You can browse by topic area, or search for a specific data set. You can browse the subreddit here. The Data Set Name is the name I gave each data set in the notes. In order to help you do that, they give you access to free minute-by-minute stock price data. SBA Public Datasets (Small Business Administration): provides a list of all the datasets available in the Public Data Inventory for the Small Business Administration. National Climatic Data Center. You can download the data and work with it on your own computer, or analyze the data in the cloud using EC2 and Hadoop via EMR.
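One way to act on the advice above about understanding a data set before analyzing it is to count its rows and check each column for empty cells. The following is a minimal Python sketch using only the standard library. The function name `summarize_csv` and the sample rows are hypothetical, made up for illustration, and not taken from any data set mentioned here.

```python
import csv
import io


def summarize_csv(text):
    """Return the row count, column names, and per-column count of empty cells."""
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    missing = {name: 0 for name in columns}
    rows = 0
    for row in reader:
        rows += 1
        for name in columns:
            value = row.get(name)
            # A cell counts as missing if it is absent or only whitespace.
            if value is None or value.strip() == "":
                missing[name] += 1
    return {"rows": rows, "columns": columns, "missing": missing}


# Hypothetical sample data, invented for this example only.
SAMPLE = "species,weight,length\nbream,242,25.4\nroach,,19.1\nperch,110,\n"

summary = summarize_csv(SAMPLE)
print(summary)
```

For a real file, the same function could be fed the file's contents read with `open(path, newline="").read()`; for very large files a streaming variant would be a better fit.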
Data can range from government budgets to school performance scores. You can even sort by format on the earth science site to find all of the available CSV datasets, for example. "DASL (pronounced "dazzle") is an online library of datafiles and stories that illustrate the use of basic statistics methods. A robust data set is usually the first step toward answering a question. Titanic Data Set. Wine Quality Dataset. The File Name gives the name of the file containing the data set and is often the original name of the data set as well. We hope that you find something interesting that you want to sink your teeth into! You can browse the data sets directly on the site. Published by SuperDataScience Team. Quandl is useful for building models to predict economic indicators or stock prices. In data cleaning projects, sometimes it takes hours of research to figure out what each column in the data set means. You could build a stock price prediction algorithm. The FBI crime data is fascinating and one of the most interesting data sets on this … You may want to “clean” the data (or have your students do so) before using them. Wunderground has an API for weather forecasts that is free for up to 500 API calls per day. In this post, we’ll walk through several types of data science projects, including data visualization projects, data cleaning projects, and machine learning projects, and identify good places to find datasets for each. At Dataquest, our interactive guided projects are designed to help you start building a data science portfolio to demonstrate your skills to employers and get a job in data. SQL & Databases: Download Practice Datasets. The other variables have some explanatory power for the target column. Data.gov makes it possible to download data from multiple US government agencies.
But first, let’s answer a couple quick, foundational questions: A dataset, or data set, is simply a collection of data. Descriptive statistics. It’s called the datasets subreddit, or /r/datasets. You can also see the most highly upvoted data sets here. These aggregators tend to have data sets from multiple sources, without much curation. Amazon makes large data sets available on its Amazon Web Services platform. Curated by: National Centers for Environmental Information (formerly … You can get started with the API here. Built for multiple linear regression and multivariate analysis, the … Due to the large amount of available data sets, it’s possible to build a complex model that uses many data sets to predict values in another. You can search and download free datasets online using these major dataset finders. Kaggle: A data science site that contains a variety of externally-contributed interesting datasets. But we can also observe that a large amount of training data plays a critical role in making deep learning models successful. Sometimes a dataset may be a zip file or folder containing multiple data tables with related data. In a relatively short time it has become one of the ‘go to’ places to acquire data, with lots of user-contributed data sets as well as fantastic data sets through data.world’s partnerships with various organizations, including a large amount of data from the US Federal Government.
Academic Torrents is a new site that is geared around sharing the data sets from scientific papers. With GCP, you can use a tool called BigQuery to explore large data sets. Instances: 649, Attributes: 33, Tasks: Classification, Regression. This is an outstanding resource. There is a spreadsheet on this main page with all of the past data sets; they’re so cool. Additionally, Wikipedia offers edit history and activity, so you can track how a page on a topic evolves over time, and who contributes to it. Different datasets are created in different ways. Sometimes you just want to make weird crap. Data sets for Regression Short Course: the first few data sets from the class notes are listed below. You might use tools like Spark or Hadoop to distribute the processing across multiple nodes. Kaggle is a data science community that hosts machine learning competitions. Notably, since the datasets are small, the Leave-One-Out Cross Validation (LOOCV) technique is used for validation, since it is considered the most advisable validation method for small data sets (Rao, Fung, & Rosales, 2008). GitHub has an API that allows you to access repository activity and code. You can read more about how the program works here. In this post, you’ll find links to sources with all kinds of datasets. It maintains websites where anyone can download its datasets related to earth science and datasets related to space. Much like Amazon, Google also has a cloud hosting service, called Google Cloud Platform. Monday Dec 03, 2018. Reddit, a popular community discussion site, has a section devoted to sharing interesting data sets. (student or professor): you can view the datasets here. You can get started here.
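The leave-one-out cross validation mentioned above holds out one example at a time, trains on the rest, and averages the results over all examples. The sketch below illustrates the idea in plain Python. It is not the method of the cited work: a simple 1-nearest-neighbour classifier is swapped in as the model, and the toy points, labels, and function names are hypothetical.

```python
def nearest_label(train, point):
    """Return the label of the training example closest to `point`."""
    def squared_distance(example):
        features, _label = example
        return sum((a - b) ** 2 for a, b in zip(features, point))

    _features, label = min(train, key=squared_distance)
    return label


def loocv_accuracy(examples):
    """Hold out each example once, predict it from the rest, return accuracy."""
    correct = 0
    for i, (features, label) in enumerate(examples):
        # Training set is every example except the one being held out.
        train = examples[:i] + examples[i + 1:]
        if nearest_label(train, features) == label:
            correct += 1
    return correct / len(examples)


# Hypothetical toy data: two well-separated clusters labelled "a" and "b".
DATA = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.9), "a"),
    ((0.8, 1.1), "a"),
    ((5.0, 5.0), "b"),
    ((5.1, 4.8), "b"),
    ((4.9, 5.2), "b"),
]

accuracy = loocv_accuracy(DATA)
print(f"LOOCV accuracy: {accuracy:.2f}")
```

Because the model is refit once per example, this approach is cheap on small data sets like the ones described here but becomes expensive as the number of rows grows.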
Each dataset is small enough to fit into memory and review in a spreadsheet. If you liked this, you might like to read the other posts in our ‘Build a Data Science Portfolio’ series: “The key to building a data science portfolio that will get you a job” and “How to present your data science portfolio on GitHub”. Enjoy! Each competition has its own associated data set. These are simple multidimensional datasets that are for the most part classic infovis datasets.
Quantopian is a site where you can develop, test, and operationalize stock trading algorithms. We hope to provide data from a wide variety of topics so that statistics teachers can find real-world examples that will be interesting to their students." These data sets are typically cleaned up beforehand, and allow for testing of algorithms very quickly. One key differentiator of data.world is the tools they have built to make working with data easier: you can write SQL queries within their interface to explore data and join multiple data sets. You can find all kinds of niche datasets in its master list, from ramen ratings to basketball data and even Seatt… Predict grades of school students based on lifestyle attributes. Here are some popular sites that make it possible to download and work with data you’ve generated. Amazon has a page that lists all of the data sets for you to browse. Require a good amount of research to understand. All of it is viewable online within Google Docs, and downloadable as spreadsheets. Corpora is a collection of small datasets that might suit your needs. Please let us know! There are a variety of externally-contributed interesting data sets on the site. The data set isn’t too messy; if it is, we’ll spend all of our time cleaning the data. Twitter has a good streaming API, and makes it relatively straightforward to filter and stream tweets. There's a book called "A Handbook of Small Data Sets" by D.J. Hand, F. Daly, A.D. Lunn, K.J. McConway and E. Ostrowski. This is a good place to start as you can search a large number of datasets in one place. Such a small scope allows those interacting with the students to understand students better rather than turning students into statistics.
As of the last time we checked, the data they allow you to download is fairly limited, but it could still be suitable for some types of projects and analysis. Luckily, there are online repositories that curate datasets and (mostly) remove the uninteresting ones. Swedish Auto Insurance Dataset. If you’ve ever worked on a personal data science project, you’ve probably spent a lot of time browsing the internet looking for interesting datasets to analyze. A registry of research data repositories. Sources: Data.gov: Contains 186,000 data sets from a broad range of government agencies. Some will be data that’s been collected via surveys. Disclaimer: the datasets are generated through random logic in VBA. These are not real sales data and should not be used for any purpose other than testing. Since it’s a torrent site, all of the data sets can be immediately downloaded, but you’ll need a BitTorrent client. In this post, we covered good places to find data sets for any type of data science project. Sage Research Methods Datasets: this collection of practice datasets contains over 120 datasets using data from real research. Data Is Plural by Jeremy Singer-Vine. Datasets for Teaching and Practicing. The recent breakthroughs in implementing deep learning techniques have shown that superior algorithms and complex architectures can impart human-like abilities to machines for specific tasks. Things to keep in mind when looking for a good data processing data set: A good place to find large public data sets is cloud hosting providers like Amazon and Google. Amazon allows you to download your personal spending data, order history, and more. There is a GitHub repository called awesome public data sets which has lots of resources under different topics.
Ideally, each column should be well-explained, so the visualization is accurate. A collection of small datasets. Other data sets: Human Resources, Credit Card, Bank Transactions. Note: I have been approached for the permission to use data set … Below is a list of the 10 datasets we’ll cover. The dataset is also good for discussion about meaningful differences as the difference between weeks 4 and 8 is very small but significant. In order to be able to do this, we need to make sure that: There are a few online repositories of data sets that are specifically for machine learning. The website above gives only the data; you would need to read the book to get the story behind the numbers, that is, any story beyond what you can glean from the data set's title. It’s very common when you’re building a data science project to download a data set and then process it. You could use these calls to build up a set of historical weather data, and make predictions about the weather tomorrow. FBI Crime Data. FiveThirtyEight is an incredibly popular interactive news and sports site started by Nate Silver. They typically clean the data for you, and also already have charts they’ve made that you can replicate or improve. Classic datasets. There aren’t many good sources to acquire this kind of data, but we’ll list a few in case you want to try your hand at a streaming data project. Netflix allows you to request your own data for download, although it will make you jump through a few hoops, and warns that the process of collating your data may take 30 days.
data.world describes itself as ‘the social network for data people’, but could be more correctly described as ‘GitHub for data’. The data sets have many missing values, and sometimes take several clicks to actually get to data. Sometimes you need data, any data, to test or mess around with. The cleaner the data, the better; cleaning a large data set can be very time-consuming. It should be nuanced and interesting enough to make charts about. Datasets can be browsed by topic or searched by keyword. They write interesting data-driven articles, like “Don’t blame a skills gap for lack of hiring in manufacturing” and “2016 NFL Predictions”. The UCI Machine Learning Repository is one of the oldest sources of data sets on the web. The World Bank regularly funds programs in developing countries, then gathers data to monitor the success of these programs. As part of Wikipedia’s commitment to advancing knowledge, they offer all of their content for free, and regularly generate dumps of all the articles on the site. The Yelp dataset is a subset of our businesses, reviews, and user data for use in personal, educational, and academic purposes. Where can I download free, open datasets for machine learning? The best way to learn machine learning is to practice with different projects. For now, it has tons of interesting data sets that lack context. Kaggle has both live and historical competitions. To access it, click this link (you’ll need to be logged in for it to work) or navigate to the Accounts and Lists button in the top right. Deluge is a good free option.
Whether you want to strengthen your data science portfolio by showing that you can visualize data well, or you have a spare few hours and want to practice your machine learning skills, we’ve got you covered. Offerings include everything from small business lending to coastal flooding to health care spending. There are also user-contributed data sets found in the new Kaggle Data sets offering. It’s a newer site, so it’s hard to tell what the most common types of data sets will look like. The internet is full of cool data sets you can work with. We all are aware of how machine learning has revolutionized our world in recent years and has made a variety of complex tasks much easier to perform. In addition, you can upload your data to data.world and use it to collaborate with others. If you’re interested, you can sign up and do our first module for free. Quandl is a repository of economic and financial data. The end result doesn’t matter as much as the process of reading in and analyzing the data. Single variable, large sample (n >= 30). When you’re working on a machine learning project, you want to be able to predict a column from the other columns in a data set. You can download data directly from the UCI Machine Learning repository, without registration. Good places to find data sets for data visualization projects are news sites that release their data publicly.
The simplest kind of linear regression involves taking a set of data (x_i, y_i), and trying to determine the "best" linear relationship y = a * x + b. Commonly, we look at the vector of errors, e_i = y_i - a * x_i - b, and look for values (a, b) that minimize the L1, L2 or L-infinity norm of the errors. A typical data visualization project might be something along the lines of “I want to make an infographic about how income varies across the different states in the US”. If you’re working with big data and need some … For more information or assistance, meet with a librarian or ask us. The NC State University Libraries provides access to datasets for use in teaching, learning, and research. They have an incentive to host the data sets, because they make you analyze them using their infrastructure (and pay them). The scope of these data sets varies a lot, since they’re all user-submitted, but they tend to be very interesting and nuanced. However, as online services generate more and more data, an increasing amount is generated in real-time, and not available in data set form. BuzzFeed started as a purveyor of low-quality articles, but has since evolved and now writes some investigative pieces, like “The court that rules the world” and “The short life of Deonte Hoard”. On the next page, look for the Ordering and Shopping Preferences section, and click on the link under that heading that says “Download order reports”.
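The line fit described at the start of this passage, y = a * x + b with errors e_i = y_i - a * x_i - b, can be sketched in a few lines of Python. The closed-form solution below minimizes only the L2 norm of the errors (ordinary least squares); minimizing the L1 or L-infinity norm needs other methods that are not shown. The function names and the sample points are assumptions chosen for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) minimizing the L2 norm of e_i = y_i - a * x_i - b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def error_norms(xs, ys, a, b):
    """Return the (L1, L2, L-infinity) norms of the error vector."""
    errors = [y - a * x - b for x, y in zip(xs, ys)]
    l1 = sum(abs(e) for e in errors)
    l2 = sum(e * e for e in errors) ** 0.5
    linf = max(abs(e) for e in errors)
    return l1, l2, linf


# Hypothetical sample points scattered around a straight line.
XS = [0.0, 1.0, 2.0, 3.0]
YS = [1.1, 2.9, 5.2, 6.8]

a, b = least_squares_line(XS, YS)
l1, l2, linf = error_norms(XS, YS, a, b)
print(f"a={a:.3f} b={b:.3f} L1={l1:.3f} L2={l2:.3f} Linf={linf:.3f}")
```

Reporting all three norms for the same fitted line makes it easy to see how differently they weigh a single large error, which is the main reason to prefer one norm over another.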
You’ll need an AWS account, although Amazon gives you a free access tier for new accounts that will enable you to explore the data without being charged. NASA is a publicly-funded government organization, and thus all of its data is public. They are sure to easily fit within memory. There are tons of options here: you could figure out what states are the happiest, or which countries use the most complex language. Much of the data requires additional research, and it can sometimes be hard to figure out which data set is the “correct” version.