Language + Molecules
@ ACL 2024 Workshop
August 12–17, 2024 hybrid in Bangkok, Thailand & Remote
Synthesizing Language and Molecules for Scientific Insight and Discovery
Welcome to the Language + Molecules Workshop! Join us as we explore the integration of molecules and natural language, with exciting applications such as developing new drugs, materials, and chemical processes. These molecular solutions will be critical to addressing global problems of unprecedented complexity, in areas such as climate change and healthcare. However, they exist in extremely large search spaces, which makes AI tools a necessity. Excitingly, the chemistry field is poised to be substantially accelerated via multimodal models combining language with molecules and drug structures.
Stay tuned by following us on Twitter @lang_plus_mols.
A natural question to ask is why we want to integrate natural language with molecules. Combining these types of information has the potential to accelerate scientific discovery: imagine a future where a doctor can write a few sentences describing a patient’s symptoms and then receive the exact structures of the drugs necessary to treat that patient’s ailment (taking into account the patient’s genotype, phenotype, and medical history). Or, imagine a world where a researcher can specify the function they want a molecule to perform (e.g., antimalarial or photovoltaic) rather than its low-level properties (e.g., pyridine-containing). This high-level control of molecules requires a method of abstract description, and humans have already developed one for communication: language. Integrating language with scientific modalities has the following major advantages, as discussed in this recent survey, section 10.3.3:
Research in scientific NLP, integrating molecules with natural language, and multimodal AI for science/medicine has experienced significant attention and growth in recent months. We believe now is the time to begin organizing this nascent community. To do so, we propose a new ACL workshop: “Language + Molecules”. Further, to broaden the community’s understanding of the associated challenges, methodologies, and goals, we will be holding an EACL tutorial. In the workshop’s first year, we will focus on the following research themes:
Submission Instructions
We plan to have both non-archival proceedings of relevant papers and a shared task to benchmark the progress of generative text-molecule models. Shared task participants are encouraged to submit short papers. All submissions should be in PDF format and made through the OpenReview submission portal. Submissions must be anonymized following ACL guidelines, but a preprint policy will not be enforced. Information on submitting shared task predictions can be found at the shared task.
Authors are invited to submit papers between 4 and 8 pages, with unlimited pages for references and appendices. In line with the ACL main conference policy, camera-ready versions of papers will be given one additional page of content. Submissions should follow the ACL template style, which can be found here.
The research presented in these papers should be substantially original. Regardless of their length, all submissions will undergo a single-track review process. All submissions must be anonymous for double-blind review. No author information should be included in the papers, and self-references that identify the authors should be avoided or anonymized. We expect each paper to be reviewed by at least three reviewers. To encourage higher quality submissions, we will offer Best Paper Award(s) based on nomination by the reviewers and extensive discussions among the chairs. Accepted papers will be presented as posters by default, and outstanding submissions will also be selected for oral or spotlight presentations.
According to the ACL workshop guidelines, we do not encourage the re-submission of already-published papers, but you are allowed to submit arXiv preprints or papers currently under submission. Moreover, work that is presented at the *CL main conference should not appear in a workshop. Please be sure to indicate conflicts of interest for all authors on your paper.
All deadlines are 11:59 pm UTC-12h (“Anywhere on Earth”).
Date | Event |
---|---|
Nov 15 2023 | Call for Workshop Papers |
Mar 17 2024 | EACL Tutorial on Language + Molecules |
Apr 7 2024 | Early Feedback Request Deadline for Underrepresented Groups |
May 7 2024 | Paper Submission Deadline |
May 7 2024 | Shared Task Submission Deadline |
Jun 6 2024 | Notification of Acceptance |
Jun 20 2024 | Camera-Ready Due |
Jul 16 2024 | Release of Proceedings |
Aug 12-17 2024 | Workshop Date |
Team | BLEU-2 | BLEU-4 | ROUGE-1 | ROUGE-2 | ROUGE-L | METEOR | Text2Mol |
---|---|---|---|---|---|---|---|
team 1 | xx | xx | xx | xx | xx | xx | xx |
team 2 | xx | xx | xx | xx | xx | xx | xx |
team 3 | xx | xx | xx | xx | xx | xx | xx |
Team | BLEU | Exact | Levenshtein | MACCS FTS | RDK FTS | Morgan FTS | FCD | Text2Mol | Validity |
---|---|---|---|---|---|---|---|---|---|
team 1 | xx | xx | xx | xx | xx | xx | xx | xx | xx |
team 2 | xx | xx | xx | xx | xx | xx | xx | xx | xx |
team 3 | xx | xx | xx | xx | xx | xx | xx | xx | xx |
Time | Program |
---|---|
9:00-9:10 | Opening remarks |
9:10-10:40 | Keynote speeches |
10:40-11:30 | Panel discussion |
11:30-12:30 | Poster session |
12:30-13:30 | Lunch break |
13:30-15:00 | Keynote speeches |
15:00-15:50 | Panel discussion |
15:50-16:50 | Oral paper session (12 min talk + 3 min QA) |
16:50-17:20 | Challenge track spotlight session (6 min talk) |
17:20-17:30 | Closing remarks |
To appear.
Please email language.molecules@gmail.com if you have any questions.
This workshop will be partially supported by the Molecule Maker Lab Institute: an AI research institute program supported by NSF under award No. 2019897 and No. 2034562. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein.