The Innovative Ideas in Data Science (IID) workshop aims to provide a venue for researchers and practitioners from both academia and industry to discuss innovative, thought-provoking, and visionary ideas in data science. The emphasis is on potentially disruptive research directions that challenge current research agendas and suggest future ones.
Many workshops associated with The Web Conference have become more like mini-conferences themselves. Our vision for IID is complementary: we will seek early-stage work on blue-sky, high-risk/high-reward research, where the authors can benefit from community feedback.
IID will be a half-day workshop on Monday, Apr 20, at The Web Conference 2020. As the conference has gone online, and to maximize universal benefit, we decided to start the IID workshop at 14:00 GMT+00:00 (i.e., 7am Pacific time, 10am Eastern time, 4pm Central European time, 10pm Taiwan time; all on April 20). With these times, people across the world will be able to attend.
We open the workshop to anyone who would like to attend, for free. This way, the pandemic has at least one positive side effect, by spreading knowledge in addition to viruses. If you are not registered through the conference, register here (we have 200 slots).
We are proud to have Amazon as the Headline supporter of IID 2020!
In total, 7 papers were accepted at IID 2020, for either oral or poster presentation. Proceedings are coming soon.
|Times are in the GMT+00:00 time zone|
Featured Papers (time allocation: 10 minutes each, including questions)
|14:35||Short creativity activity|
User-centric Privacy in an ML-Ubiquitous Society
Ben Zhao is the Neubauer Professor of Computer Science at the University of Chicago. He received his PhD from UC Berkeley (2004) and his BS from Yale (1997). He is an ACM Distinguished Scientist, and recipient of the NSF CAREER award, MIT Technology Review's TR-35 Award (Young Innovators Under 35), ComputerWorld Magazine's Top 40 Tech Innovators award, a Google Faculty award, and the IEEE ITC Early Career Award. His work has been covered by media outlets such as Scientific American, New York Times, Boston Globe, LA Times, MIT Tech Review, and Slashdot. He has published more than 160 publications in the areas of security and privacy, networked systems, wireless networks, data mining, and HCI (H-index 65). He served as TPC co-chair for the World Wide Web Conference (WWW 2016) and the ACM Internet Measurement Conference (IMC 2018), and is General Co-Chair for ACM HotNets 2020.
The impact of deep learning and its many applications on our lives is undeniable. Today, much of the work in the ML community is focused on developing techniques and algorithms to make models more powerful. Yet as ML models become more powerful, there is increasing evidence that these models are slowly eroding the individual privacy of the citizens they affect. Governments, companies, and even nation states can use online data to build powerful classifiers that track us and identify us, usually without any warning or notification to the targets (you and me). For example, the NY Times recently profiled Clearview.AI, a company using online photos to build facial recognition models of millions of citizens without their knowledge or authorization, simply by scraping online photos from social networks and public sources. In this talk, I’m going to argue that we have crossed a line, where the balance of power has now definitively shifted towards data-rich entities like companies and nation states and away from individual citizens. There is a real need to develop user-centric privacy protections against deep learning classifiers that try to restore the balance and increase privacy protections for individuals. I will talk briefly about Fawkes, our new work that introduces user-side tools that perturb your own images (in imperceptible ways) such that, if captured and used to build a facial recognition model against you, they would produce incorrect models that misclassify you as someone else. Fawkes works with 96%-100% effectiveness against state-of-the-art facial recognition systems from Amazon, Microsoft, and Face++. I’ll wrap up by talking about other work in this direction and why this direction of research is critical for user privacy moving forward.
Observational Supervision & Analyst Exhaust
Christopher (Chris) Ré is an associate professor in the Department of Computer Science at Stanford University. He is in the Stanford AI Lab and is affiliated with the Statistical Machine Learning Group. His recent work is to understand how software and hardware systems will change as a result of machine learning, along with a continuing, petulant drive to work on math problems. Research from his group has been incorporated into scientific and humanitarian efforts, such as the fight against human trafficking, along with products from technology and enterprise companies. He cofounded a company, based on his research into machine learning systems, that was acquired by Apple in 2017. More recently, he cofounded SambaNova Systems based, in part, on his work on accelerating machine learning. He received a SIGMOD Dissertation Award in 2010, an NSF CAREER Award in 2011, an Alfred P. Sloan Fellowship in 2013, a Moore Data Driven Investigator Award in 2014, the VLDB Early Career Award in 2015, the MacArthur Foundation Fellowship in 2015, and an Okawa Research Grant in 2016. His research contributions have spanned database theory, database systems, and machine learning, and his work has won best paper at a premier venue in each area, respectively, at PODS 2012, SIGMOD 2014, and ICML 2016.
As machine learning systems become more embedded in our daily lives, there is an opportunity for these systems to learn passively from our interactions with them. This talk discusses some rough ideas that we have been exploring, including supervision obtained by instrumenting analyst software, eye trackers with radiologists and other subject matter experts, and a foray into consumer devices. As an example, gaze data is rich: it not only reveals salient portions of an image or video, but the psychology literature suggests that gaze can also convey more subtle cues, such as confidence. Moreover, trained analysts often have routine patterns, and deviation from those patterns is significant. We are exploring the circumstances in which one can practically and provably learn from this style of supervision with minimal or no conventional supervision. This talk will be short on results and long on other people's ideas that I've found interesting.
|15:55||Fireside chat with Ben Y. Zhao and Chris Ré|
|16:20||Poster spotlight talks (3 min each)|
|16:40||Virtual Poster Session|
|Submission||Friday, 24 January 2020, 23:59 Anywhere-on-Earth Time|
|Notification||Monday, 10 February, 2020|
|Camera-ready||Monday, 17 February, 2020|
|Workshop||Monday, April 20, 2020|
If authors do not want their paper to appear in the proceedings:
|Submission||Friday, 21 February 2020, 23:59 Anywhere-on-Earth Time|
|Notification||Friday, 6 March, 2020|
|Workshop||Monday, April 20, 2020|
All papers will be peer reviewed, single-blind. We welcome novel research papers, work-in-progress papers, and visionary papers.
Submissions must be in PDF, written in English, no more than 4 pages long (not including references). Shorter papers are welcome. Please format your paper using the ACM SIG conference proceedings template (use sample-sigconf.pdf as the template) available here.
For accepted papers, at least one author must attend the workshop to present the work.
For paper submission, proceed to the IID 2020 submission website.