Subject Specialist

John Winsor
Systems Librarian and Head of Technical Services
Tel: 510-430-2066

Starting Points for Data Science
Data Science Books in the Library
Link to books on Data Science in the Mills Library catalog

SpringerLink Restricted Resource Some full text available
SpringerLink is an integrated full-text database for journals, books, protocols, eReferences, and book series published by Springer.

USA Facts Unrestricted Resource
USAFacts is a data-driven portrait of the American population, our government’s finances, and government’s impact on society.

Wiley Online Library - Computer Science Journals Restricted Resource Some full text available
Wiley Online Library hosts the world's broadest and deepest multidisciplinary collection of online resources covering life, health and physical sciences, social science, and the humanities. The Computer Science collection includes full-text access to 40+ journals.

Training & Professional Development

Data Carpentry Unrestricted Resource
Data Carpentry develops and teaches workshops on the fundamental data skills needed to conduct research. Its mission is to provide researchers with high-quality, domain-specific training covering the full lifecycle of data-driven research. Data Carpentry focuses on the introductory computational skills needed for data management and analysis in all domains of research. Lessons are domain-specific, ranging from life and physical sciences to social science, and build on learners' existing knowledge so that they can quickly apply the skills to their own research. The initial target audience is learners who have little to no prior computational experience.
Note: Free

Full-Text Databases

JSTOR Restricted Resource Some full text available
JSTOR offers high-quality, interdisciplinary content to support scholarship and teaching. The JSTOR digital archive includes more than 1,500 leading academic journals in the humanities, social sciences, and sciences, as well as select monographs and other materials valuable for academic work.

Electronic Journals

Annals of Data Science Restricted Resource Some full text available
Annals of Data Science (AODS) is an academic journal focusing on Big Data analytics and applications. It promotes not only the use of interdisciplinary techniques, including statistics, artificial intelligence, and optimization, to process Big Data and conduct data mining, but also the use of knowledge gleaned from Big Data in real-life applications.
International Journal of Data Science and Analytics Restricted Resource Some full text available
This journal encompasses the areas of data analytics, machine learning, and managing big data, as well as related new scientific challenges ranging from data capture, creation, storage, search, sharing, analysis, and visualization, to integration across heterogeneous, interdependent complex resources for real-time decision-making, collaboration, and value creation.

eBooks

Practical Data Science Cookbook Restricted Resource
If you are an aspiring data scientist who wants to learn data science and numerical programming concepts through hands-on, real-world project examples, this is the book for you. Whether you are brand new to data science or you are a seasoned expert, you will benefit from learning about the structure of data science projects, the steps in the data science pipeline, and the programming examples presented in this book. Since the book is formatted to walk you through the projects with examples and explanations along the way, no prior programming experience is required.
Python Data Science Handbook Unrestricted Resource
This website contains the full text of the Python Data Science Handbook by Jake VanderPlas.

Working scientists and data crunchers familiar with reading and writing Python code will find this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming, and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. Quite simply, this is the must-have reference for scientific computing in Python.
R for Data Science Restricted Resource
If you are a data analyst who has a firm grip on some advanced data analysis techniques and wants to learn how to leverage the features of R, this is the book for you. You should have some basic knowledge of the R language and should know about some data science topics.

Statistics/Data Sources

FRED: Federal Reserve Economic Data Unrestricted Resource
Federal Reserve Economic Data (FRED) is a database maintained by the Research division of the Federal Reserve Bank of St. Louis that has more than 421,000 economic time series from 81 sources. The data can be viewed in graphical or text form, including on mobile devices, or downloaded for import into a database or spreadsheet. The series cover banking, business/fiscal, consumer price indexes, employment and population, exchange rates, gross domestic product, interest rates, monetary aggregates, producer price indexes, reserves and monetary base, U.S. trade and international transactions, and U.S. financial data. The time series are compiled by the Federal Reserve and collected from government agencies such as the U.S. Census and the Bureau of Labor Statistics. The economic data published on FRED are widely reported in the media and play a key role in financial markets. FRED is also a gateway to GeoFRED, a data-mapping tool that displays FRED data series in color-coded form at the state, metropolitan statistical area, and county levels.
HathiTrust Research Center Unrestricted Resource
The HathiTrust Research Center (HTRC) provides research access to the public domain corpus of the HathiTrust Digital Library. The HTRC is a collaborative research center launched jointly by Indiana University and the University of Illinois, along with the HathiTrust Digital Library. It develops cutting-edge software tools and cyberinfrastructure that enable advanced computational access to the growing digital record of human knowledge, helping researchers meet the technical challenges of working with massive amounts of digital text. The HTRC provides an infrastructure to search, collect, analyze, and visualize the full text of nearly 3 million public domain works and is intended for nonprofit and educational researchers.
Statistical Abstract of the U.S. Restricted Resource Some full text available
The Statistical Abstract of the United States is a one-volume, comprehensive summary of statistics on the social, political, and economic organization of the United States. Use the Abstract as a convenient volume for statistical reference, and as a guide to sources of more information both in print and on the Web.
USA Facts Unrestricted Resource
USAFacts is a data-driven portrait of the American population, our government’s finances, and government’s impact on society.