We are awash in data, and the power of data science can buoy us amid the rising tide. Data scientists, their research focused on the extraction of value from data, provide the deep expertise needed to use data for good. And at Columbia, their growing networks of interdisciplinary colleagues are working with them to channel the digital torrent into insight, solutions, and action.
Value is defined differently in different fields. For a policymaker, value can mean justifying decisions that affect our lives at all levels. In business, value can mean boosting the bottom line. For researchers, value can mean the discovery of knowledge: a scientific breakthrough, an insight about human behavior, or a new interpretation of the world around us.
As large data sets become ever more ubiquitous and ways to collect, analyze, and visualize their contents become ever more sophisticated, data science guides innovation in every field. It also provides deeper understanding of related issues of digital security, privacy, and ethics. The innovations we derive from data science will drive our cars, treat disease, and keep us safe. Data science enables exploration, discovery, prediction, and decision. Learning to see the world in new ways through data, we find answers to familiar questions and whole new kinds of questions altogether. We find solutions to urgent challenges. We pursue the full implications of data for good.
More than 90 percent of the world’s data has emerged over the last few years. Less than one percent has been analyzed. The data streams in—from sensors marking changes in climates and aging in bridges, from scanned documents, from health records, from social media and cell phone activity and GPS signals, from countless digital trails we leave as we go about our electronically enhanced lives. Just storing and protecting the data is daunting, let alone using it. What new kinds of problems will data science be able to solve? What new techniques will be invented that would not have come into existence if not for the marriage of computer science and statistics?
The rapid pace of data accumulation and of innovation in the field makes data science a frontier as lacking in guideposts as it is full of promise. Driverless cars, precision medicine, climate modeling, historical archives, social networks: no aspect of life looks the same through the lens of data science.
Data science is shifting the very paradigm of research. Traditionally, researchers turn to data to hone and test their hypotheses as they pursue a predefined problem. Today, more and more, we start with large data sets to see what they can offer. We are learning to listen to the data and to explore the stories it can tell. Opening up the previously unseen, data science is a revolution that some have compared to the invention of the microscope.
Transforming disciplines requires unprecedented levels of collaboration and innovation. We are changing how we define and pursue our research agendas, how we prepare our students for leadership, and how we evaluate our impact. We are bringing data science skills to every part of the University and every realm of human activity. Fortunately, Columbia has a head start, with a track record of interdisciplinary breakthroughs tackling societal challenges spanning climate, energy, health, social justice, and more.
Drawing on strengths in computer science, statistics, and operations research, Columbia is uniquely poised to expand data science to every corner of the University. Data science collaborations are core to Columbia’s next chapter. Over the five years of The Columbia Commitment, the University is building on an early lead in data science to focus on three goals:
• Advancing the state of the art in data science;
• Transforming all fields through the application of data science;
• Ensuring the responsible, ethical use of data to benefit society.
Driving this effort is the Data Science Institute, through which more than 250 faculty members and researchers from 12 Columbia schools work together in fields from business to medicine, social work to literature, history to natural science. A Master of Science in Data Science program offers students an in-depth education focused on data science. Extracurricular activities such as boot camps and hackathons give students hands-on experience with developing real-world data applications. Dynamic faculty recruiting, a post-doctoral fellowship program, and an undergraduate research program give rising stars in data science opportunities to pursue breakthroughs. Seed-funded projects spark innovative research, from treating cancer to discovering new planets. And curricular innovations of the Collaboratory@Columbia, created in partnership with Columbia Entrepreneurship, bring data science education to Columbia’s undergraduates and professional students.
The Columbia Commitment to Data and Society invites donors and alumni to imagine and commit to a not-too-distant future when data science informs every part of Columbia’s pursuit of its mission, providing new ways to answer familiar questions, new ways to even ask questions, and new ways to prepare students to lead us to a better future. Together with our faculty, colleagues, and a growing number of alumni and donors, we look to realize the promise of data for good.
At Columbia, we are harnessing the power of data science across all fields to drive exploration, provide insights, and make predictions to inform better decisions. Data for good means using that power responsibly and ethically to tackle society's greatest challenges.
Researchers across Columbia affiliated with the Data Science Institute
Million tweets students analyzed to correctly predict the 2016 Presidential Election outcome
215 million gigabytes (or 215 petabytes) of data encoded by Columbia scientists on a single gram of DNA
We all know, from research and anecdotal evidence, how many hours and hours most youth spend on the Internet, sharing personal information on social media platforms about who they talk to, where they live, and what they’re doing.
I study how armed groups mobilize on social media. With social media data comes something completely new to researchers.
Laura Kurgan ’88GSAPP
With my team, we’ve created an interactive, open-source map of the city of Aleppo, which has been at the center of fighting in Syria, that combines layers of high-resolution satellite images with data gathered by human rights organizations and the United Nations.
Big data transforms how we see and solve problems on a scale equivalent to the introduction of the microscope 500 years ago.
There is a long period of neurodegeneration that occurs prior to the onset of symptoms, and those symptoms aren’t necessarily unique to Parkinson’s disease, so there’s an even further lag after the manifestation of the symptoms before diagnosis.