IBM is proud to sponsor AAAI 2023. We invite all attendees to visit us during the event in booth 119 at the Walter E. Washington Convention Center in Washington, D.C.
We look forward to meeting you at the event and telling you more about our latest work and career opportunities at IBM Research. Our team will be presenting a series of workshops, papers and demos related to a broad range of AI topics such as foundation models, trustworthy AI, natural language processing and understanding, knowledge and reasoning, AI automation, human-centered AI, and federated learning.
Read our accepted papers:
Visit us in the Exhibit Hall at Booth 119. View the booth demo schedule here: https://ibm.biz/AAAI23_BoothDemos
For presentation times of workshops, demos, papers, and tutorials, see the agenda section below. (Note: all times are displayed in your local time.)
Join conversations on machine learning best practices, attend education tutorials, and participate in workshops. Meet with IBM recruiting and hiring managers about future job opportunities or 2023 summer internships.
Explore all current IBM Research job openings.
Featured positions to learn more about at AAAI:
We look forward to meeting and seeing you in Washington, D.C.!
Stay connected with us for career opportunities: https://ibm.biz/connectwithus
The goal of this tutorial is to elucidate the unique and novel connections between algorithmic fairness and the rich literature on adversarial machine learning. Compared to other tutorials on AI fairness, this tutorial will emphasize the connection between recent advances in fair learning and adversarial robustness literature. The range of the presented techniques will cover a complete fairness pipeline, starting from auditing ML models for fairness violations, post-processing them to rapidly alleviate bias, and re-training or fine-tuning models to achieve algorithmic fairness. Each topic will be presented in the context of adversarial ML, specifically, (i) connections between fair similarity metrics for individual fairness and adversarial attack radius, (ii) auditing as an adversarial attack, (iii) fair learning as adversarial training, (iv) distributionally robust optimization for group fairness. We will conclude with (v) a summary of the most recent advances in adversarial ML and its potential applications in algorithmic fairness.
The tutorial is designed for a broad range of audiences, including researchers, students, developers, and industrial practitioners. Basic knowledge of machine learning and deep learning is preferred but not required. All topics will be supported with relevant background and demonstrations on varying real data use-cases utilizing Python libraries for fair machine learning.
Mikhail Yurochkin (IBM); Yuekai Sun; Pin-Yu Chen (IBM)
The AI4BPM Bridge at AAAI 2023 brings together academics and industry professionals working at the intersection of artificial intelligence and business process management under the same roof. The event will include invited talks, poster sessions, tutorials, student outreach, meet and mingle opportunities, hands-on system demonstrations, and much more!
Tathagata Chakraborti (IBM); Vatche Isahagian (IBM); Andrea Marrella; Chiara Di Francescomarino; Jung koo Kang (IBM); Yara Rizk (IBM)
Asset Health and Monitoring is an emerging AI application that aims to deliver efficient AI-powered solutions to industrial problems such as anomaly detection and failure pattern analysis. In this lab-based tutorial, we present a web-based time series anomaly detection tool: a new scikit-learn compatible toolkit specialized for time series anomaly detection. The tutorial focuses on the design and development of an anomaly detection pipeline, a zero-configuration interface for automated discovery of an anomaly pipeline for any given dataset (univariate or multivariate), a set of five frequently used workflows empirically derived from past experience, and a scalable technique for efficient pipeline execution. We extensively tested the deployed anomaly detection services using multiple datasets with varying time-series characteristics.
Dhaval Patel (IBM)
We propose an NL paradigm and platform for the construction of business automation rules that incorporate a constrained natural language (CNL) – a domain-specific highly consumable language to validate and review synthesized code. Our approach utilizes LLMs to translate business rules described in NL into CNL for human review which can then be transpiled into the business automation code of the rule engine. To address challenges in the translation from NL to CNL, we utilize several techniques such as constrained decoding, fine-tuning and prompt engineering.
Michael Desmond (IBM); Vatche Isahagian (IBM); Vinod Muthusamy (IBM); Evelyn Duesterwald (IBM)
Reinforcement learning has many practical real-world applications. However, RL solutions are highly sensitive to the choice of the RL algorithm and its internal hyperparameters, which requires expert manual effort. This limits the widespread applicability of RL solutions in practice. In this lab session, we introduce an automated system, AutoDO, for end-to-end solving of sequential decision-making problems. Our system automatically selects the best RL algorithm and its hyperparameters using a search strategy based on limited discrepancy search coupled with Bayesian optimization. It supports both online and offline RL as well as automated Markov Decision Process (MDP) models based on mathematical programming.
Data scientists with very little or no experience in RL will be able to formulate and solve sequential decision-making problems using AutoDO. They will focus on preparing the inputs, namely a system environment for online RL and a data set annotated with a knowledge data structure for offline RL and automated MDP models, and will gain experience in using our system to automate the solution pipeline generation for the optimal decision policy. Prerequisites will be shared in advance: mainly, basic familiarity with Python and a free subscription, obtained in advance, to a public-facing API service.
Shankar Subramaniam (IBM); Takayuki Osogami (IBM); Radu Marinescu (IBM); Alexander Zadorojniy (IBM); Long Vu (IBM); Nhan Pham (IBM)
Metrics for evaluating image data reconstructed by machine learning generative methods are well established for applications involving natural images. However, the evaluation of machine learning models applied to precipitation fields in the context of weather generators is still not sufficiently addressed. In this work, we discuss the use of various metrics for weather data generation and we propose the use of the Frechet Inception Distance metric based on weights from a weather dataset.
Maysa Malfiza Garcia de Macedo (IBM); Daniela Szwarcman (IBM); Jorge Luis Guevara Diaz (IBM); Dario Augusto Borges Oliveira (IBM); Bianca Zadrozny (IBM)
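As a rough illustration of the metric named in the abstract above, the Frechet distance between two sets of feature vectors can be sketched as follows. This is a minimal sketch, not the authors' code: it assumes the features have already been extracted by some network (for the paper, one trained on a weather dataset), and the function name `frechet_distance` and the synthetic inputs are hypothetical.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features).
    Computes ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^(1/2)).
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # The trace of the matrix square root of C_a C_b equals the sum of the
    # square roots of its eigenvalues, which are real and non-negative for a
    # product of two covariance matrices (clipping guards against round-off).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


# Hypothetical stand-ins for features of real and generated fields.
rng = np.random.default_rng(0)
real_feats = rng.normal(size=(500, 4))
generated_feats = real_feats + 2.0  # same spread, shifted mean
print(frechet_distance(real_feats, generated_feats))
```

A lower value means the two feature distributions are closer; identical feature sets give a distance of approximately zero.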
Optimization with constraint learning (OCL) uniquely leverages machine learning (ML) to design optimization models in which constraints and objectives are directly learned from data whenever explicit expressions are unknown. While OCL offers great advantages to design more accurate models, in a faster way, practitioners should also be aware of possible pitfalls and inaccuracies arising from embedding fitted models as optimization constraints.
Divided into four parts, the OCL Lab offers theoretical as well as hands-on tutorials, demonstrated on a case study from the World Food Programme. Throughout the OCL Lab, participants will become familiar with two novel Python packages: (1) OptiCL to learn and embed constraints and (2) DOFramework to evaluate the optimal solutions generated by an OCL algorithm. The first two parts of the lab will provide participants with theoretical and practical knowledge for using ML models to learn constraints and objectives directly from data. The remaining two parts will be dedicated to novel quality metrics for OCL and a structured testing framework for OCL algorithms.
S. Ilker Birbil; Donato Maragno; Orit Davidovich (IBM)
Visit us at booth 119 in the exhibit hall to talk to our researchers and recruiters. We'll also be doing demos of our work. View the booth demo schedule here: https://ibm.biz/AAAI23_BoothDemos
Exhibit Hours:
This is an overview of our newly released Python package NL2LTL which leverages the latest in natural language understanding (NLU) and large language models (LLMs) to translate English inputs to linear temporal logic (LTL) formulas. Such an interface allows direct translation to formal languages that a reasoning system can use, while at the same time, allowing the end-user to provide inputs in natural language without having to understand the details of the underlying formal language in a system. The package comes with support for a set of default LTL patterns, corresponding to popular DECLARE templates, but is also fully extensible to new domains so adopters of the package can configure it to their needs. The package has just been open-sourced and is free to use for the AI community under the MIT license.
Francesco Fuggitti; Tathagata Chakraborti (IBM)
Recent machine reading comprehension datasets include extractive and boolean questions but current approaches do not offer integrated support for answering both question types. We present a front-end demo to a multilingual machine reading comprehension system that handles boolean and extractive questions. It provides a yes/no answer and highlights the supporting evidence for boolean questions. It provides an answer for extractive questions and highlights the answer in the passage. Our system, GAAMA 2.0, achieved first place on the TyDI leaderboard at the time of submission. We contrast two different implementations of our approach: including multiple transformer models for easy deployment, and a shared transformer model utilizing adapters to reduce GPU memory footprint for a resource-constrained environment.
Scott McCarley (IBM); Mihaela Bornea (IBM); Sara Rosenthal (IBM); Anthony Ferritto (IBM); Arafat Sultan (IBM); Avi Sil (IBM); Hans Florian (IBM)
Streams of irregularly occurring events are commonly modeled as a marked temporal point process. Many real-world datasets such as e-commerce transactions and electronic health records often involve events where multiple event types co-occur, e.g. multiple items purchased or multiple diseases diagnosed simultaneously. In this paper, we tackle multi-label prediction in such a problem setting, and propose a novel Transformer-based Conditional Mixture of Bernoulli Network (TCMBN) that leverages neural density estimation to capture complex temporal dependence as well as probabilistic dependence between concurrent event types. We also propose potentially incorporating domain knowledge in the objective by regularizing the predicted probability. To represent probabilistic dependence of concurrent event types graphically, we design a two-step approach that first learns the mixture of Bernoulli network and then solves a least-squares semi-definite constrained program to numerically approximate the sparse precision matrix from a learned covariance matrix. This approach proves to be effective for event prediction while also providing an interpretable and possibly non-stationary structure for insights into event co-occurrence. We demonstrate the superior performance of our approach compared to existing baselines on multiple synthetic and real benchmarks.
Xiao Shou; Tian Gao (IBM); Shankar Subramaniam (IBM); Debarun Bhattacharjya (IBM); Kristin Bennett
We present a machine learning system for forecasting forced displacement populations deployed at the Danish Refugee Council (DRC). The system, named Foresight, supports long term forecasts aimed at humanitarian response planning. It is explainable, providing evidence and context supporting the forecast. Additionally, it supports scenarios, whereby analysts are able to generate forecasts under alternative conditions. The system has been in deployment since early 2020 and powers several downstream business functions within DRC. It is central to our annual Global Displacement Report which informs our response planning. We describe the system, key outcomes, lessons learnt, along with technical limitations and challenges in deploying machine learning systems in the humanitarian sector.
Rahul Nair (IBM); Bo Schwartz Madsen; Alexander Kjærum
There has been a surge of interest in learning optimal decision trees using mixed-integer programs (MIP) in recent years, as heuristic-based methods do not guarantee optimality and find it challenging to incorporate constraints that are critical for many practical applications. However, existing MIP methods that build on an arc-based formulation do not scale well, as the number of binary variables grows with both the depth of the tree and the size of the dataset. Moreover, they can only handle sample-level constraints and linear metrics. In this paper, we propose a novel path-based MIP formulation where the number of decision variables is independent of the size of the dataset. We present a scalable column generation framework to solve the MIP optimally. Our framework produces a multiway-split tree, which is more interpretable than the typical binary-split trees due to its shorter rules. Our method can handle nonlinear metrics such as F1 score and incorporate a broader class of constraints. We demonstrate its efficacy with extensive experiments. We present results on datasets containing up to 1,008,372 samples, while existing MIP-based decision tree models do not scale well on data beyond a few thousand points. We report superior or competitive results compared to the state-of-the-art MIP-based methods with up to a 24X reduction in runtime.
Shivaram Subramanian (IBM); Wei Sun (IBM)
We propose KnowGL, a tool that allows converting text into structured relational data represented as a set of ABox assertions compliant with the TBox of a given Knowledge Graph (KG), such as Wikidata. We address this problem as a sequence generation task by leveraging pre-trained sequence-to-sequence language models, e.g. BART. Given a sentence, we fine-tune such models to detect pairs of entity mentions and jointly generate a set of facts consisting of the full set of semantic annotations for a KG, such as entity labels, entity types, and their relationships. To showcase the capabilities of our tool, we build a web application consisting of a set of UI widgets that help users to navigate through the semantic data extracted from a given input text. We make the KnowGL model available at https://huggingface.co/ibm/knowgl-large.
Gaetano Rossiello (IBM); Md Faisal Mahbub Chowdhury (IBM); Nandana Mihindukulasooriya (IBM); Owen Cornec (IBM); Alfio Gliozzo (IBM)
Consider a network of decentralized computing agents collaboratively solving a nonconvex stochastic composite problem. In this work, we propose a single-loop algorithm, called DEEPSTORM, that achieves optimal sample complexity for this setting. Unlike double-loop algorithms that require a large batch size to compute the (stochastic) gradient once in a while, DEEPSTORM uses a small batch size, which is an advantage in settings such as streaming data and online learning. This is the first method achieving optimal sample complexity for decentralized nonconvex stochastic composite problems while requiring only a small batch size. We conduct convergence analysis for DEEPSTORM with both constant and diminishing step sizes. Additionally, under proper initialization and a small enough desired solution error, we show that DEEPSTORM with a constant step size achieves a network-independent sample complexity, with an additional linear speed-up over centralized methods. All code is available at https://github.com/gmancino/DEEPSTORM.
Gabriel Mancino-ball; Shengnan Miao; Yangyang Xu; Jie Chen (IBM)
Many works in explainable AI have focused on explaining black-box classification models. Explaining deep reinforcement learning (RL) policies in a manner that could be understood by domain users has received much less attention. In this paper, we propose a novel perspective to understanding RL policies based on identifying important states from automatically learned meta-states. The key conceptual difference between our approach and many previous ones is that we form meta-states based on locality governed by the expert policy dynamics rather than based on similarity of actions, and that we do not assume any particular knowledge of the underlying topology of the state space. Theoretically, we show that our algorithm to find meta-states converges and the objective that selects important states from each meta-state is submodular leading to efficient high quality greedy selection. Experiments on four domains (four rooms, door-key, minipacman, and pong) and a carefully conducted user study illustrate that our perspective leads to better understanding of the policy. We conjecture that this is a result of our meta-states being more intuitive in that the corresponding important states are strong indicators of tractable intermediate goals that are easier for humans to interpret and follow.
Ronny Luss (IBM); Amit Dhurandhar (IBM); Miao Liu (IBM)
Process automation has evolved from end-to-end automation of repetitive process branches to hybrid automation where robots perform some activities and humans serve the others. In the context of knowledge-intensive processes such as IT operations, implementing hybrid automation is a natural choice where robots can perform certain mundane functions, with humans taking over the decision of when and which IT systems need to act. Recently, ChatOps, which refers to conversation-driven collaboration for IT operations, has rapidly accelerated efficiency by providing a cross-organization and cross-domain platform to resolve and manage issues as soon as possible. Hence, providing a natural language interface to robots is a logical progression to enable collaboration between humans and robots. Developers can use several ChatOps frameworks to build conversational interfaces for robots, but it requires significant development effort. This work presents a no-code approach to provide a conversational interface that enables human workers to collaborate with robots executing automation scripts. We further detail our process of mining the conversations between humans and robots to monitor performance and identify the scope for improvement in service quality. Finally, we demonstrate our deployed solution that creates robots for a ChatOps environment enabling hybrid collaboration. The robots identify the intents of users' requests and automatically orchestrate one or more relevant automation tasks to serve the request.
Jayachandu Bandlamudi (IBM); Kushal Mukherjee (IBM); Prerna Agarwal (IBM); Sampath Dechu (IBM); Siyu Huo (IBM); Vatche Isahagian (IBM); Vinod Muthusamy (IBM); Naveen Purushothaman (IBM); Renuka Sindhgatta (IBM)
Machine learning and deep learning-based decision making has become part of today's software. The goal of this work is to ensure that machine learning and deep learning-based systems are as trusted as traditional software. Traditional software is made dependable by following rigorous practices such as static analysis, testing, debugging, verifying, and repairing throughout the development and maintenance life-cycle. Similarly, machine learning models need to be kept up to date so that their performance is not compromised. For this, current systems rely on scheduled re-training of these models as new data arrives. In this work, we propose to measure the data drift that occurs when new data arrives, so that models can be re-trained adaptively whenever re-training is actually required, irrespective of schedules. In addition, we generate explanations at the sentence level and dataset level to capture why a given payload text has drifted.
Nishtha Madaan (IBM); Adithya Manjunatha; Hrithik Nambiar; Aviral Kumar Goel; Harivansh Kumar (IBM); Diptikalyan Saha (IBM); Srikanta Bedathur
We apply the machinery of interventional causal learning with programmable interventions to the domain of applications management. Modern applications are modularized into interdependent components or services (e.g. microservices) for ease of development and management. The communication graph among such components is a function of application code and is not always known to the platform provider. In our solution we learn this unknown communication graph solely using application logs observed during the execution of the application by using fault injections in a staging environment. Specifically, we have developed an active (or interventional) causal learning algorithm that uses the observations obtained during fault injections to learn a model of error propagation in the communication among the components. The "power of intervention" additionally allows us to address the presence of confounders in unobserved user interactions. We demonstrate the effectiveness of our solution in learning the communication graph of well-known microservice application benchmarks. We also show the efficacy of the solution on a downstream task of fault localization in which the learned graph indeed helps to localize faults at runtime in a production environment (in which the location of the fault is unknown). Additionally, we briefly discuss the implementation and deployment status of a fault injection framework called WOLFFI which incorporates the developed technology.
Qing Wang (IBM); Jesus Rios Aliaga (IBM); Saurabh Jha (IBM); Karthikeyan Shanmugam (IBM); Frank Bagehorn (IBM); Xi Yang (IBM); Robert Filepp (IBM); Naoki Abe (IBM); Larisa Shwartz (IBM)
With the advancement of deep learning technology, neural networks have demonstrated their excellent ability to provide accurate predictions in many tasks. However, a lack of consideration for neural network calibration will not gain trust from humans, even for high-accuracy models. In this regard, the gap between the confidence of the model's predictions and the actual correctness likelihood must be bridged to derive a well-calibrated model. In this paper, we introduce the Neural Clamping Toolkit, the first open-source framework designed to help developers employ state-of-the-art model-agnostic calibrated models. Furthermore, we provide animations and interactive sections in the demonstration to familiarize researchers with calibration in neural networks. A Colab tutorial on utilizing our toolkit is also introduced.
Lei Hsiung; Yung-chen Tang; Pin-Yu Chen (IBM); Tsung-yi Ho
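The gap between confidence and correctness that the abstract above describes is commonly summarized by the expected calibration error (ECE). The following is a minimal sketch of that standard metric, not the Neural Clamping Toolkit's API: the function name, the equal-width binning, and the toy inputs are assumptions chosen for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected calibration error with equal-width confidence bins.

    confidences: predicted confidence per sample, each in [0, 1].
    correct: 1 if the prediction was right, 0 otherwise.
    Returns the sample-weighted average of |mean confidence - accuracy|
    over the non-empty bins.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += len(members) / n * abs(avg_conf - accuracy)
    return ece


# Toy example: an overconfident model that is right only half the time.
print(expected_calibration_error([0.95, 0.95, 0.95, 0.95], [1, 1, 0, 0]))
```

A well-calibrated model has an ECE near zero; calibration methods aim to shrink this value without changing the predicted labels.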
This demo paper discusses a scalable platform for emerging data-driven AI applications targeted toward predictive maintenance solutions. We propose a common AI software architecture stack for building diverse AI applications such as anomaly detection, failure pattern analysis, and asset health forecasting for more than 100K industrial assets of a similar class. As part of the AI system demonstration, we have identified three key topics for discussion: scaling model training across multiple assets; joint execution of multiple AI applications; and bridging the gap between current open source software tools and the emerging needs of AI applications. To demonstrate the benefits, AI Model Factory has been tested to build models for various industrial assets such as wind turbines and oil wells. The system is deployed on API Hub for demonstration.
Dhaval Patel (IBM); Shuxin Lin (IBM); Dhruv Shah (IBM); Srideepika Jayaraman (IBM); Joern Ploennigs (IBM); Anuradha Bhamidipaty (IBM); Jayant Kalagnanam (IBM)
Domain generalization (DG) aims to train a model to perform well in unseen domains under different distributions. This paper considers a more realistic yet more challenging scenario, namely Single Domain Generalization (Single-DG), where only a single source domain is available for training. To tackle this challenge, we first try to understand when neural networks fail to generalize. We empirically identify a property of a model that correlates strongly with its generalization, which we term "model sensitivity". Based on our analysis, we propose a novel strategy of Spectral Adversarial Data Augmentation (SADA) to generate augmented images targeted at the highly sensitive frequencies. Models trained with these hard-to-learn samples can effectively suppress the sensitivity in the frequency space, which leads to improved generalization performance. Extensive experiments on multiple public datasets demonstrate the superiority of our approach, which surpasses the state-of-the-art single-DG methods.
Jiajin Zhang; Hanqing Chao; Amit Dhurandhar (IBM); Pin-Yu Chen (IBM); Ali Tajer; Yangyang Xu; Pingkun Yan
The use of machine learning models in consequential decision making often exacerbates societal inequity, in particular yielding disparate impact on members of marginalized groups defined by race and gender. The area under the ROC curve (AUC) is widely used to evaluate the performance of a scoring function in machine learning, but is studied in algorithmic fairness less than other performance metrics. Due to the pairwise nature of the AUC, defining an AUC-based group fairness metric is pairwise-dependent and may involve both intra-group and inter-group AUCs. Importantly, considering only one category of AUCs is not sufficient to mitigate unfairness in AUC optimization. In this paper, we propose a minimax learning and bias mitigation framework that incorporates both intra-group and inter-group AUCs while maintaining utility. Based on this Rawlsian framework, we design an efficient stochastic optimization algorithm and prove its convergence to the minimum group-level AUC. We conduct numerical experiments on both synthetic and real-world datasets to validate the effectiveness of the minimax framework and the proposed optimization algorithm.
Zhenhuan Yang; Yan Lok Ko; Kush Varshney (IBM); Yiming Ying
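To make the intra-group and inter-group AUCs mentioned in the abstract above concrete, here is a minimal sketch that computes them directly from their pairwise definition. It is illustrative only: the function names and toy data are hypothetical, and the paper's exact definitions and optimization algorithm may differ.

```python
def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def group_aucs(scores, labels, groups):
    """AUC of positives from group a against negatives from group b.

    Keys (a, a) are intra-group AUCs; keys (a, b) with a != b are
    inter-group AUCs. Pairs lacking positives or negatives are skipped.
    """
    by_group_label = {}
    for s, y, g in zip(scores, labels, groups):
        by_group_label.setdefault((g, y), []).append(s)
    names = sorted(set(groups))
    result = {}
    for a in names:
        for b in names:
            pos = by_group_label.get((a, 1), [])
            neg = by_group_label.get((b, 0), [])
            if pos and neg:
                result[(a, b)] = pairwise_auc(pos, neg)
    return result


# Toy data: two groups, one positive and one negative example each.
aucs = group_aucs(
    scores=[0.9, 0.2, 0.6, 0.7],
    labels=[1, 0, 1, 0],
    groups=["a", "a", "b", "b"],
)
worst = min(aucs.values())  # the quantity a minimax objective would raise
print(aucs, worst)
```

Looking at the minimum over all group pairs, rather than a single overall AUC, is what exposes a group whose positives are systematically ranked below negatives.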
We present nBIIG, a neural Business Intelligence (BI) Insights Generation system. Given a table, our system applies various analyses to create corresponding RDF representations, and then uses a neural model to generate fluent textual insights out of these representations. The generated insights can be used by an analyst, via a human-in-the-loop paradigm, to enhance the task of creating compelling table reports. The underlying generative neural model is trained over large and carefully distilled data, curated from multiple BI domains. Thus, the system can generate faithful and fluent insights over open-domain tables, making it practical and useful.
Yotam Perlitz (IBM); Dafna Sheinwald (IBM); Noam Slonim (IBM); Michal Shmueli-Scheuer (IBM)
Adversarial robustness studies the worst-case performance of a machine learning model to ensure safety and reliability. With the proliferation of deep-learning based technology, the potential risks associated with model development and deployment can be amplified and become dreadful vulnerabilities. This paper provides a comprehensive overview of research topics and foundational principles of research methods for adversarial robustness of deep learning models, including attacks, defenses, verification, and novel applications.
Pin-Yu Chen (IBM); Sijia Liu
We present a simple linear programming (LP) based method to learn compact and interpretable sets of rules encoding the facts in a knowledge graph (KG) and use these rules to solve the KG link completion problem. Our LP model chooses a set of rules of bounded complexity from a list of candidate first-order logic rules and assigns weights to them. The complexity bound is enforced via explicit constraints. We show how to combine simple rule generation heuristics with our rule selection LP to obtain predictions with accuracy comparable to state-of-the-art codes, even while generating much more compact rule sets. Furthermore, when we take as input rules generated by other codes, we can often improve interpretability by reducing the number of chosen rules, while maintaining accuracy.
Sanjeeb Dash (IBM); Joao Goncalves (IBM)
We introduce equi-tuning, a novel fine-tuning method that transforms (potentially non-equivariant) pretrained models into group equivariant models while incurring minimum loss between the feature representations of the pretrained and the equivariant models. Large pretrained models can be equi-tuned for different groups to satisfy the needs of various downstream tasks. Equi-tuned models benefit from both group equivariance as an inductive bias and semantic priors from pretrained models. We provide applications of equi-tuning on three different tasks: image classification, compositional generalization in language, and fairness in natural language generation (NLG). We also provide a novel group-theoretic definition for fairness in NLG. The effectiveness of this definition is shown by testing it against a standard empirical method of fairness in NLG. We provide experimental results for equi-tuning using a variety of pretrained models: Alexnet, Resnet, VGG, and Densenet for image classification; RNNs, GRUs, and LSTMs for compositional generalization; and GPT2 for fairness in NLG. We test these models on benchmark datasets across all considered tasks to show the generality and effectiveness of the proposed method.
Sourya Basu; Prasanna Sattigeri (IBM); Karthikeyan Natesan Ramamurthy (IBM); Vijil Vijil (IBM); Kush Varshney (IBM); Lav Varshney; Payel Das (IBM)
Graphical event models (GEMs) are representations of temporal point process dynamics between different event types. Many real-world applications, however, involve limited event stream data, making it challenging to learn GEMs from data alone. In this paper, we introduce approaches that can work together in a score-based learning paradigm, to augment data with potentially different types of background knowledge. We propose novel scores for learning an important parametric class of GEMs; in particular, we propose a Bayesian score for leveraging prior information as well as a more practical simplification that involves fewer parameters, analogous to Bayesian networks. We also introduce a framework for incorporating easily assessed qualitative background knowledge from domain experts, in the form of statements such as 'event X depends on event Y' or 'event Y makes event X more likely'. The proposed framework has Bayesian interpretations and can be deployed by any score-based learner. Through an extensive empirical investigation, we demonstrate the practical benefits of background knowledge augmentation while learning GEMs for applications in the low-data regime.
Debarun Bhattacharjya (IBM); Tian Gao (IBM); Shankar Subramaniam (IBM); Xiao Shou
AI technology and neuroscience have progressed such that it’s again prudent to look to the brain as a model for AI. Examining current artificial neural networks, theoretical computer science, and systems neuroscience, this workshop will uncover gaps in knowledge about the brain and models of intelligence.
Bernard Baars modeled the brain’s cognitive processes as a Global Workspace. This was elaborated in network neuroscience as the Global Neuronal Workspace, and in theoretical computer science as the Conscious Turing Machine (CTM) [1]. The CTM is a substrate-independent model for consciousness. AI researchers have proposed variations and extensions of the Global Workspace, connecting the CTM to Transformers [2] and using them to communicate among specialist modules [3].
Meanwhile, neuroscience has identified large-scale brain circuits that bear a striking resemblance to patterns found in contemporary AI architectures such as Transformers. This workshop will aim to map the Global Workspace and CTM to AI systems using the brain’s architecture as a guide. We hypothesize that this approach can achieve general intelligence and that high-resolution recordings from the brain can be used to validate its models.
The goal of this workshop is to bring together a multi-disciplinary group comprising AI researchers, systems neuroscientists, algorithmic information theorists, and physicists to understand gaps in this larger agenda and to determine what’s known about what’s needed to build thinking machines.
Mark Wegman (IBM); James Kozloski (IBM)
The recent wave of using machine learning to analyze and manipulate real-world systems has inspired many research topics at the interface of machine learning and dynamical systems. However, real-world applications are diverse and complex, with vulnerabilities such as simulation divergence or violation of prior knowledge. Deploying ML-based dynamical models in real-world systems raises a series of challenges, including scalability, stability, and trustworthiness.
Through this workshop, we aim to provide an informal, cutting-edge platform for research and discussion on the co-development of machine learning models and dynamical systems. We welcome all contributions related to ML-based applications or theory for dynamical systems, and to solutions of ML problems from a dynamical systems perspective.
From an alternative perspective, many machine learning problems can be viewed as dynamical systems, with examples ranging from neural network forward propagation to optimization dynamics and countless problems with sequential data. These examples have generated increasing interest in studying the intrinsic, evolving dynamics of such problems, with the potential to yield novel methodologies for theory development and applications.
The mission of the MLmDS workshop is to bring together researchers from diverse backgrounds including but not limited to artificial intelligence and dynamical systems, gathering insights from these fields to facilitate collaboration and adaptation of theoretical and application knowledge amongst them.
Lam Nguyen (IBM); Trang H. Tran; Wang Zhang; Subhro Das (IBM); Lily Weng
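The workshop description above mentions that neural network forward propagation can be viewed as a dynamical system. One common way to see this is that a residual update x + h * f(x) is a forward Euler step for dx/dt = f(x). The sketch below assumes a toy vector field f(x) = -x so the result can be compared with the exact solution; the function names and step size are illustrative assumptions, not part of the workshop material.

```python
import math


def vector_field(x):
    """Toy dynamics f(x) = -x, standing in for a learned layer."""
    return [-xi for xi in x]


def residual_block(x, h):
    """One residual update, identical to a forward Euler step of size h."""
    return [xi + h * fi for xi, fi in zip(x, vector_field(x))]


def forward(x, depth, h):
    """Stack `depth` residual blocks, i.e. integrate up to time depth * h."""
    for _ in range(depth):
        x = residual_block(x, h)
    return x


# 100 blocks with step 0.01 integrate dx/dt = -x up to t = 1.
approx = forward([1.0], depth=100, h=0.01)[0]
exact = math.exp(-1.0)
print(approx, exact)
```

In this view, network depth plays the role of integration time, which is why stability results for numerical integrators can inform questions about deep network behavior.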
Deep learning models are at the core of Artificial Intelligence research today. It is well known that deep learning techniques that were disruptive for Euclidean data such as images or sequence data such as text are not immediately applicable to graph-structured data. This gap has driven a wave of research on deep learning for graphs, covering tasks such as graph representation learning, graph generation, and graph classification. New neural network architectures for graph-structured data have achieved remarkable performance in these tasks when applied to domains such as social networks, bioinformatics, and medical informatics.
This one-day workshop aims to bring together academic researchers and industrial practitioners from different backgrounds and perspectives to address the above challenges and to discuss a wide range of topics of emerging importance for GNNs. The workshop will consist of contributed talks, contributed posters, and invited talks on a wide variety of methods and applications. Work-in-progress papers, demos, and visionary papers are also welcome. The workshop intends to share visions for investigating new approaches and methods at the intersection of Graph Neural Networks and real-world applications.
Lingfei Wu; Jian Pei; Jiliang Tang; Yinglong Xia; Xiaojie Guo (IBM)
Fluid mechanics continues to advance quickly in the age of artificial intelligence, mainly due to the abundance of experimental data, field data assimilation, and high-fidelity multiscale simulations. Among the many data-driven approaches recently applied to this discipline, ML-based reduced-order models (ROMs) have received particular attention because of their algorithmic simplicity, explainability, and computational efficiency. In this work, we have devised and implemented an ML-based ROM which combines dimensionality reduction via an Encoder-Decoder (ED) neural network with forecasting capabilities in latent space using Deep Neural Operators (DeepONets). We assessed the proposed architecture with a spatiotemporal dataset generated by the numerical solution of the Rayleigh-Bénard convection (RBC) problem. The reconstruction error of the model over the unseen datasets was lower than 10%, demonstrating the ED technique's accurate spatial representation and the neural operators' robustness in estimating future system states. This work contributes an accurate and efficient ML-based model to the fluid dynamics community for tackling the well-known and challenging RBC problem.
Joao Lucas de Sousa Almeida (IBM); Pedro Rocha (IBM); Allan Carvalho (IBM); Alberto Costa Nogueira Junior (IBM)
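For readers unfamiliar with DeepONets, mentioned in the abstract above, the sketch below shows the basic forward pass: a branch network encodes the input (here imagined as a latent state), a trunk network encodes the query coordinate (here a time), and the prediction is their dot product. The sizes, single-layer networks, and random untrained weights are assumptions for illustration; this is not the architecture or training setup used in the paper.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

NUM_BASIS = 4   # number of branch/trunk output terms (assumed)
LATENT_DIM = 3  # size of the latent state fed to the branch net (assumed)

# Untrained, randomly initialized weights for single-layer networks.
W_BRANCH = [[rng.uniform(-1, 1) for _ in range(LATENT_DIM)]
            for _ in range(NUM_BASIS)]
W_TRUNK = [rng.uniform(-1, 1) for _ in range(NUM_BASIS)]


def branch(u):
    """Encode the input state u into NUM_BASIS coefficients."""
    return [math.tanh(sum(w * ui for w, ui in zip(row, u)))
            for row in W_BRANCH]


def trunk(t):
    """Encode the query coordinate t into NUM_BASIS basis values."""
    return [math.tanh(w * t) for w in W_TRUNK]


def deeponet(u, t):
    """Operator output at query t: dot product of branch and trunk."""
    return sum(b * k for b, k in zip(branch(u), trunk(t)))


latent_state = [0.2, -0.5, 1.0]
print(deeponet(latent_state, 0.1))
```

Because the trunk takes the query coordinate as an input, the same trained operator can be evaluated at any query value rather than only at fixed time steps.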
We invite papers that describe innovative use of AI technology or techniques in election processes. The workshop is intended to provide a forum for discussing new approaches and challenges in building AI that people trust and use for critical applications that power society – conducting elections, and for exchanging ideas about how to move the area forward.
Artificial Intelligence and machine learning have transformed modern society. They also affect how elections are conducted in democracies, with mixed outcomes. For example, digital marketing campaigns have enabled candidates to connect with voters at scale and communicate remotely during COVID-19, but there remains widespread concern about the spread of election disinformation as a result of AI-enabled bots and aggressive strategies.
In response, we conducted the first workshop at NeurIPS 2021 to examine the challenges of credible elections globally in an academic setting with apolitical discussion of significant issues. The speakers, panels and reviewed papers discussed current and best practices in holding elections, tools available for candidates and the experience of voters. They highlighted gaps and experience regarding AI-based interventions and methodologies. To ground the discussion, the invited speakers and panelists were drawn from three international geographies: US – representing one of the world’s oldest democracies; India – representing the largest democracy in the world; and Estonia – representing a country using digital technologies extensively during elections and as a facet of daily life. The workshop had contributions on all technological and methodological aspects of elections and voting.
Biplav Srivastava; Anita Nikolich; Andrea Hickerson; Chris Dawes; Tarmo Koppel; Sachindra Joshi (IBM)