09/8 – Jonathan Gemmell, PhD – DePaul University
Automatic Extraction of Informal Topics from Online Suicidal Ideation
Abstract: Suicide is an alarming public health problem accounting for a considerable number of deaths each year worldwide. Many more individuals contemplate suicide. Understanding the attributes, characteristics, and exposures correlated with suicide remains an urgent and significant problem. As social networking sites have become more common, users have adopted these sites to talk about intensely personal topics, among them their thoughts about suicide. Such data has previously been evaluated by analyzing the language features of social media posts and using factors derived by domain experts to identify at-risk users. In this work, we automatically extract informal latent recurring topics of suicidal ideation found in social media posts. Our evaluation demonstrates that we are able to automatically reproduce many of the expertly determined risk factors for suicide. Moreover, we identify many informal latent topics related to suicidal ideation such as concerns over health, work, self-image, and financial issues.
Bio: Jonathan F. Gemmell is an Assistant Professor in the School of Computing at DePaul University. He holds a BA in Classics, an MS in Computer Science and a PhD in Computer Science. His research focuses on the social web, data analysis and artificial intelligence. He is a faculty member at the Web Intelligence Laboratory. He has published dozens of articles in international peer-reviewed conferences and journals. His previous business experience includes trading foreign currency derivatives on the floor of the Chicago Mercantile Exchange and managing a trading and investment group.
09/15 (1 of 2) – Rami Ghannam – DePaul University
User-Targeted Denial-of-Service Attacks
Abstract: Mobile networks are prevalent in today’s world, used in a variety of applications ranging from personal use to the work environment and beyond. Ensuring security for users in a mobile network is therefore increasingly important. Denial-of-service (DoS) attacks have proved to be the biggest threat to mobile networks in recent years. Much work has been done on DoS attacks targeting the infrastructure of the mobile network; user-targeted DoS attacks have been neglected in comparison. The fourth generation of cellular networks, 4G LTE, is the fastest growing mobile network in terms of subscriber numbers. The security of mobile networks has improved throughout the generations; however, 4G still has vulnerabilities in the signaling plane that allow a malicious attacker to target a specific user. In particular, the Attach Request procedure and the Tracking Area Update (TAU) procedure can be exploited. Deploying a rogue base station and forcing the targeted user to connect to it is therefore possible. The attacker can then deny selected services to the targeted user, such as LTE data communication.
Bio: Rami Ghannam has a Bachelor’s and Master’s degree in Computer and Communications Engineering. He worked for two years as a network software engineer conducting the installation and quality control process for PBX platforms. He joined DePaul University in 2012 to pursue his PhD in Computer Science. His research interests include Computer Communications Networks, Software Defined Networks and Mobile Networks.
09/15 (2 of 2) – Himan Abdollahpouri – DePaul University
Effective Exploration-Exploitation Trade-off in Sequential Music Recommendation
Abstract: Personalization is an essential part of recommender systems: tailoring the recommended items to the tastes and interests of the end user. However, in many real-world applications there is a tremendous need to explore a broader range of items, for a variety of reasons: learning more about what a user would and would not like, giving different items the opportunity to be exposed to users, and a lack of available items that match the user’s immediate preferences, to name just a few.
Music recommendation has become very popular in recent years thanks to its success in helping users find interesting songs in an easy-to-use manner. As in many other recommendation domains, exploration is very important in music recommender systems: it allows the system to learn more about the user and, at the same time, to gather more information about items that have not been rated enough. To the best of our knowledge, exploration in recommender systems has mostly been done at random; that is, there has been no timing strategy for choosing between exploration and exploitation. In this project, we show that the previous sequence of played content in sequential music recommendation is important in deciding whether to recommend an exploratory item or an exploitative one.
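To make the timing idea concrete, here is a minimal sequence-aware explore/exploit policy in Python. This is a hypothetical sketch, not the policy from the talk: the `choose` function, its `window` and `threshold` parameters, and the rule of exploring after a run of exploitative plays are all illustrative assumptions.

```python
import random

def choose(history, catalog, known_scores, window=5, threshold=0.8):
    """Sequence-aware explore/exploit decision (illustrative only).

    history:      list of past decisions, each "explore" or "exploit"
    catalog:      all available songs
    known_scores: song -> estimated preference, for songs rated enough
    """
    recent = history[-window:]
    exploit_ratio = sum(1 for h in recent if h == "exploit") / max(len(recent), 1)
    unknown = [s for s in catalog if s not in known_scores]
    # If the recent sequence was dominated by exploitation, and there is
    # something left to learn about, take an exploratory step.
    if exploit_ratio >= threshold and unknown:
        return "explore", random.choice(unknown)
    # Otherwise exploit: play the song with the best known score.
    return "exploit", max(known_scores, key=known_scores.get)
```

For example, after five consecutive exploitative plays the policy explores an unrated song; after a stretch of exploration it returns to the best-scoring known song. The point, as in the abstract, is that the decision depends on the previous sequence rather than on a coin flip at every step.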
Bio: Himan Abdollahpouri is a PhD candidate at the Web Intelligence Lab at DePaul University. Himan holds an MSc in Artificial Intelligence and a BSc in Computer Engineering from Iran University of Science and Technology and Bu Ali Sina University, respectively. Prior to joining DePaul, Himan was a senior software engineer at TOSAN Inc., a leading company in the software engineering and payment/banking industry in Iran. While in the U.S., Himan interned as a data scientist at Pandora Media Inc. in Summer 2017, helping the company improve user engagement through more efficient music recommendation strategies. Himan has several publications in the most prestigious venues in recommender systems, such as ACM RecSys and ACM UMAP. His research interests lie primarily in machine learning, recommender systems, and data mining. He is also interested in psychology and sociology, and in how these could be combined with machine learning for better user modeling.
09/22 – Jonathan Gemmell – DePaul University
Minimum Constraint Removal Problem, Part 1
Abstract: Given a set of obstacles and two designated points in the plane, the Minimum Constraint Removal problem asks for a minimum number of obstacles that can be removed so that a collision-free path exists between the two designated points. In this work, we extend the study of Minimum Constraint Removal. We show that the problem remains NP-hard in two cases: (1) when all the obstacles are axis-parallel rectangles, and (2) when all the obstacles are line segments such that no three intersect at the same point. Our results improve those of Erickson and LaValle and answer some of their open questions. As a byproduct of our NP-hardness reductions, we prove that, unless the Exponential-Time Hypothesis (ETH) fails, Minimum Constraint Removal cannot be solved in subexponential time 2^{o(n)}, where n is the number of obstacles in the instance. This shows that significant improvement on the brute-force 2^{O(n)}-time algorithm is unlikely.
We then present a subexponential-time algorithm for instances of Minimum Constraint Removal in which the number of obstacles that overlap at any point is constant; the algorithm runs in time 2^{O(sqrt{N})}, where N is the number of vertices in the auxiliary graph associated with the instance of the problem. We show that significant improvement on this algorithm is unlikely by showing that, unless ETH fails, Minimum Constraint Removal with bounded overlap number cannot be solved in time 2^{o(sqrt{N})}. We describe several exact and approximation algorithms that leverage heuristics and discuss their performance in an extensive empirical simulation.
Bio: Jonathan F. Gemmell is an Assistant Professor in the School of Computing at DePaul University. He holds a BA in Classics, an MS in Computer Science and a PhD in Computer Science. His research focuses on the social web, data analysis and artificial intelligence. He is a faculty member at the Web Intelligence Laboratory. He has published dozens of articles in international peer-reviewed conferences and journals. His previous business experience includes trading foreign currency derivatives on the floor of the Chicago Mercantile Exchange and managing a trading and investment group.
9/29 – Iyad Kanj – DePaul University
Minimum Constraint Removal Problem, Part 2
How to navigate a robot through obstacles?
Abstract: We consider the following motion planning problem: Given a set of obstacles in the plane, can we navigate a robot between two designated points without crossing more than k different obstacles? Equivalently, can we remove k obstacles so that there is an obstacle-free path between the two designated points?
As you learned last Friday, the above problem is NP-hard, even when each obstacle is a line segment. The problem can be formulated and generalized into the following graph problem: Given a planar graph G whose vertices are colored by color sets, two designated vertices s, t in V(G), and a natural number k, is there an s-t path in G that uses at most k colors? If each obstacle is connected, the resulting graph from this formulation satisfies the property that each color induces a connected subgraph of G.
In this talk, we discuss the parameterized complexity of the above graph problem. We first show that, without the color-connectivity property, the problem is parameterized intractable, and sits high in the parameterized complexity hierarchy. We show that even very restricted slices of the problem remain parameterized intractable. We then shift our attention to instances in which each color is connected. We exploit the planarity of the graph and the connectivity of the colors to design a parameterized algorithm for the problem w.r.t. the combined parameters k and the treewidth of the graph. Finally, we discuss applications of this parameterized algorithm to the geometric motion planning problem.
This is joint work with Eduard Eiben at TU Wien.
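The colored-path formulation above can be solved by brute force on toy instances. The sketch below is an illustrative baseline, not the parameterized algorithm from the talk: it tries color subsets in increasing size and checks s-t reachability within each, so its running time is exponential in the number of colors, as one expects for an NP-hard problem.

```python
# Brute-force baseline for the colored s-t path problem (toy instances only).
from itertools import combinations

def min_colors_st_path(adj, colors, s, t):
    """Minimum number of colors an s-t path must use.

    adj:    vertex -> set of neighboring vertices (undirected graph)
    colors: vertex -> set of colors on that vertex (may be empty)
    """
    all_colors = sorted({c for cs in colors.values() for c in cs})
    for k in range(len(all_colors) + 1):
        for allowed in combinations(all_colors, k):
            allowed = set(allowed)
            # Keep only vertices whose colors are all allowed.
            ok = {v for v in adj if colors[v] <= allowed}
            if s not in ok or t not in ok:
                continue
            # Search for t among the allowed vertices.
            seen, stack = {s}, [s]
            while stack:
                v = stack.pop()
                if v == t:
                    return k  # first success: k is minimal
                for w in adj[v]:
                    if w in ok and w not in seen:
                        seen.add(w)
                        stack.append(w)
    return None  # no s-t path even with every color allowed

# Toy instance: s and t are color-free; a has color 1, b has colors {1, 2}.
adj = {"s": {"a", "b"}, "a": {"s", "t"}, "b": {"s", "t"}, "t": {"a", "b"}}
colors = {"s": set(), "a": {1}, "b": {1, 2}, "t": set()}
```

On this instance the answer is 1: the path s-a-t uses only color 1. In the geometric setting, each color corresponds to an obstacle, so this is exactly "remove at most k obstacles to free a path".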
Bio: Iyad Kanj is a Professor in the School of Computing at CDM. He obtained his Ph.D. from Texas A&M University in 2001, and joined DePaul the same year. His research area is algorithms and complexity theory, and his research interests are: parameterized complexity & exact computation, graph theory & algorithms, combinatorial optimization, and computational geometry.
10/6 – Kim Brown – General Electric
Women in Technology Keynote Address
Note: the WiT workshop is scheduled for 12 to 6 pm. Please RSVP:
The College of Computing, UPE, the PhD Student Council, and HerCDM are proud to host the Women in Technology workshop. The event will begin with lunch and keynote speaker Kim Brown, Global Head of Marketing at GE Renewable Energy Digital, CEO of Centrally Human, and Founder of Big Data for Women. Following the keynote, a roundtable discussion will feature a panel of six women in technology.
Tanu Malik | Professor at DePaul’s College of Computing and Digital Media
Jennifer Boyce | Data Scientist at Sprout Social
Simona Rollinson | CIO at the Cook County Bureau of Technology
Alison Stanton | Founder and Chief Problem Solver at Stanton Ventures
Maura Foley | Senior Applied Data Scientist at Civis Analytics
Anjana Thirumalai | Senior Product Designer at Grubhub
Schedule of Events:
12:30 to 1: Lunch
1 to 2: Keynote Speaker
2 to 2:30: Break
2:30 to 4:30: Roundtable Discussion
4:30 to 6: Networking and Hors d’oeuvres
10/13 – Gabe Fils – DePaul University
The sciunit: Making Computational Applications Reusable
Abstract: As the march of technology makes perfect connectivity a global reality, collaboration in computational work becomes ever more ubiquitous. Scientists, students, professors, and professionals work in distributed fashion on computational applications, often with the help of collaborative tools. In recent years, distributed version control, as exemplified by Git, has gained acceptance as the method for grouping and sharing of data, code, and other objects related to these endeavors. We build on the concept of distributed version control to propose a versioned unit of computation, which we call a sciunit. The sciunit addresses three main issues for those working on – and especially those starting work on – collaborative projects: gathering the disparate elements required to build and describe the computation, repeating the computation to produce the exact original result, and studying the computation to acquire the understanding necessary to perform derivative work.
The sciunit client, a Linux command-line tool, creates and manages sciunits with a few short, simple commands. It uses application virtualization to bundle an application’s code, data, and dependencies into one self-contained package. A newly-created sciunit is easily shareable on any collaborative service as a distinct unit, and it is capable of instant, dependency-free execution. We make the sciunit self-documenting by collecting the large, detailed set of the application’s provenance information, and condensing it into an intuitive graphical summary of the application’s workflow. This summary is an interactive visualization that allows selective expansion of specific application components. Since developing computational applications typically entails making successive modifications to an original piece of work, we incorporate an efficient versioning and storage system into the sciunit client.
The sciunit represents an easily creatable, readily repeatable, and inherently understandable aggregation of one distinct unit of computation. Collectively, the sciunit, client, and supporting infrastructure put forth a practical model for the collaboration, verification, and development of modern computational applications.
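The efficient versioning-and-storage idea can be illustrated, very loosely, with content-addressed storage, in which identical file contents shared across versions are stored only once. The toy Python sketch below is an analogy of that general technique, not the sciunit client's actual implementation; the class and method names are invented.

```python
# Toy content-addressed store illustrating deduplicating versioning.
# This is an illustrative analogy, NOT the sciunit client's implementation.
import hashlib

class ContentStore:
    def __init__(self):
        self.blobs = {}      # sha256 hex digest -> file contents (bytes)
        self.versions = []   # one snapshot per commit: {filename: digest}

    def commit(self, files):
        """Record a new version; unchanged contents are stored only once."""
        snapshot = {}
        for name, data in files.items():
            digest = hashlib.sha256(data).hexdigest()
            self.blobs.setdefault(digest, data)  # dedup by content hash
            snapshot[name] = digest
        self.versions.append(snapshot)
        return len(self.versions) - 1            # version number

    def checkout(self, version):
        """Reconstruct every file of a past version, byte for byte."""
        return {name: self.blobs[d] for name, d in self.versions[version].items()}
```

Committing two versions that differ in one file stores three blobs, not four, and any past version can be reconstructed exactly — the same properties (compact successive versions, exact repeatability) that the abstract claims for the sciunit.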
This talk is based on the paper that will be presented at eScience 2017 in New Zealand. It can be found on arXiv.org at https://arxiv.org/abs/1707.05731.
Bio: Gabe Fils is a first-year Computer Science graduate student at DePaul University. His interests include computational reproducibility, operating system design, and research distillation.
10/20 – Elena Zheleva – University of Illinois at Chicago
Data Science in Social Spaces: Incentives, Personalization, and Privacy
Abstract: The abundance of personal data that people share online provides an opportunity to study social phenomena at a large scale. It also enables the development of a new category of information technology products, ones that are powered by data and sophisticated user models. Data science products range from personalized recommendations to fostering healthy online communities, and they bring together advances in machine learning, causal inference, and big data technologies. In this talk, I will go over my work on using data science to study sharing incentives and recommendations, and also discuss the privacy implications of machine algorithms for social media users and businesses.
Bio: Elena Zheleva is an assistant professor in Computer Science at the University of Illinois at Chicago. Her research interests span data science, machine learning, causal inference, network science, and online privacy. She has presented her research at top-tier conferences, such as KDD, WSDM, and WWW, and she is the coauthor of the book “Privacy in Social Networks.” Her experience includes building and managing a data science team at an e-commerce company, and working on initiatives at the intersection of public policy and data science at NSF. She obtained her Ph.D. in Computer Science from the University of Maryland College Park in 2011.
10/27 – Amor Montes De Oca, Director of Strategic Initiatives at 2112
2112: DePaul’s Partnered Incubator for Tech Creatives
Amor Montes de Oca is Director of Strategic Initiatives at 2112, Chicago’s first business incubator focused on the development of entrepreneurs in music, film/video, and creative-industry technology. Through community, educational opportunities, and access to capital, 2112 creates truly fertile ground for the professional development and acceleration of its members. Amor is responsible for developing member engagement programs and partnerships, including spearheading and developing initiatives as part of 2112’s strategic direction and core principles. A passionate leader, she combines business acumen, a personable demeanor, and a gift for cultivation and stewardship. Amor is a mother, travel enthusiast, flamenco aficionado, knitting novice, and aspiring paintball ninja.
11/3 – Sugandha Malviya
Analyzing Real-World Queries to Support Requirements Engineering
Abstract: Requirements Engineering (RE) is a vital process for creating high-quality software systems. It comprises various tasks related to discovering, documenting, and maintaining different kinds of requirements. A plethora of tools, methods, and techniques are needed to perform these tasks successfully; however, even with proper tool support in place, they can be time-consuming and difficult to perform.
One of the major obstacles in supporting RE tasks stems from the fact that information needs to be collected and consolidated from many different and diverse data sources. Typically, such artifacts are neither managed by a single person nor stored in a single location, but distributed across multiple repositories (e.g., document management systems, source-code repositories, issue trackers, or application life-cycle management (ALM) services). Furthermore, artifact data is sometimes incomplete or inconsistent, and important trace links can be missing. Accessing these data sources and combining them to produce meaningful and desired results can therefore be a cumbersome and error-prone endeavor.
Analyzing real-world queries can shed light on the questions requirements professionals would like to ask and the artifacts needed to support their questions. Furthermore, using the analyzed information, project strategies can be defined upfront, thereby allowing a requirements engineer to proactively instrument their environments with supporting tools, and strategically collect data that is needed to answer the queries of interest to their project.
Bio: Sugandha Malviya is a PhD candidate at CDM, DePaul University. Her primary research interests include requirements engineering, requirements traceability, and natural language queries. She has a number of publications in major requirements engineering conferences, including RE and REFSQ. She received her Master’s degree in Computer Science and Engineering and was awarded a gold medal for her outstanding academic performance. She completed her bachelor’s degree with honors in Information Technology. Previously, she worked at a research organization and a teaching institute.
11/10 – Joe Chesak
The Power of Rich Modelling (Capturing Reality as it Exists in the Wild)
Abstract: When we model a problem, we aim to strip away all but what is core: the key features and mechanisms around the problem, and measures of those. From that point, only those measurements get passed forward for inspection. In industry, modeling often takes a hit because reaching a decision can outweigh what the decision is. It can be as truncated as “How can we get this into a spreadsheet?” In that case, if it can’t be shoehorned into a spreadsheet, it will not influence the analysis.
Luckily for that imperfect world, our data-processing toolset has advanced rapidly, and one major area of progress is data store technology. NoSQL solutions – those challenging the relational database status quo – have provided specialized data stores. A particularly interesting member of that group is the graph database, which is structured as a nodal network rather than a set of tables or collections of documents.
Using the graph database Neo4j, we will demonstrate how working with a graph database changes the model definition phase, allows for liberal inclusion of problem context in data analysis, and provides freedom of expression when working with non-technical team contributors. Joe will discuss his own lessons learned along the way.
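To see what the nodal-network model looks like in code, here is a minimal in-memory property graph in Python. It only mimics the labeled-node/typed-relationship structure of a graph database; it is not Neo4j's API (Neo4j is queried with Cypher), and the platform/sensor example is invented to echo the speaker's oil-industry background.

```python
# Minimal in-memory property graph: labeled nodes with properties,
# connected by typed relationships. Illustrative only; not Neo4j's API.
class Graph:
    def __init__(self):
        self.nodes = {}   # node id -> {"labels": set, "props": dict}
        self.edges = []   # (source id, relationship type, target id)

    def add_node(self, nid, labels, **props):
        self.nodes[nid] = {"labels": set(labels), "props": props}

    def relate(self, src, rel, dst):
        self.edges.append((src, rel, dst))

    def neighbors(self, nid, rel):
        """Follow relationships of one type out of a node."""
        return [d for s, r, d in self.edges if s == nid and r == rel]

# Invented example: an offshore platform and one of its sensors.
g = Graph()
g.add_node("p1", ["Platform"], name="Alpha")
g.add_node("s1", ["Sensor"], unit="bar")
g.relate("p1", "HAS_SENSOR", "s1")
```

Because the relationship is a first-class object rather than a foreign key in a table, adding new context (another sensor, a maintenance event, a crew member) means adding nodes and edges, not redesigning a schema — the "liberal inclusion of problem context" the abstract refers to.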
Bio: Joe Chesak has a BA in Biology and an MBA in Decision Systems and Marketing. He has worked 18 years in data-centric IT roles, mostly in the US, where he worked with companies including Microsoft, Fujitsu, Deloitte & Touche, and several start-ups. For the past 10 years he has lived in Norway, working mainly in the oil industry, where he has managed streaming sensor data from offshore oil platforms, built reporting tools, and helped architect data warehouses. Joe’s experience across a wide range of business environments has allowed him to innovate by cross-pollination, borrowing from one industry to the next. Joe is currently Chief Data Officer at Bolder Technology, with work centered on data modeling and platform tooling.