Latest news

September 3, 2026
in Workshops, GA4GH, SwissGA4GH

SwissGA4GH Workshop: Interoperability for genomic variants, practical use of GA4GH standards

Date: 3 September 2026
Time: 09:30 - 15:00
Location: Campus Biotech, Geneva, Switzerland
Room: H4.02-A

This SwissGA4GH workshop will focus on the practical use of GA4GH standards to improve interoperability in genomic variant representation and federated discovery.

The session will address challenges arising from heterogeneous variant data across clinical, cohort, research, literature-derived and public resources. It will examine how standards such as VRS, Categorical VRS and Beacon can support consistent modelling, matching and discovery of variants across distributed infrastructures.
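As a rough illustration of why content-derived identifiers help with matching across resources, the sketch below uses the GA4GH truncated digest (the first 24 bytes of SHA-512, base64url-encoded) on a simplified variant record. The `variant_key` helper, the `demo.` prefix, the `seq-chr7` label, the coordinates and the JSON layout are assumptions made for this example only; actual VRS defines its own data model and serialization rules, which are not reproduced here.

```python
import base64
import hashlib
import json


def sha512t24u(blob: bytes) -> str:
    """GA4GH truncated digest: first 24 bytes of SHA-512, base64url-encoded."""
    return base64.urlsafe_b64encode(hashlib.sha512(blob).digest()[:24]).decode("ascii")


def variant_key(sequence_id: str, start: int, end: int, alt: str) -> str:
    """Hypothetical, simplified identifier for a variant record.

    Sorted, compact JSON stands in for the real VRS serialization here.
    """
    record = {"sequence": sequence_id, "start": start, "end": end, "alt": alt.upper()}
    blob = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "demo." + sha512t24u(blob)


# Two resources describing the same change end up with the same key,
# regardless of argument order or letter case in their local records.
clinic = variant_key("seq-chr7", 140753335, 140753336, "t")
cohort = variant_key(alt="T", end=140753336, start=140753335, sequence_id="seq-chr7")
print(clinic == cohort)
```

Because the key depends only on the normalised content, two distributed resources can compare identifiers without exchanging the underlying records.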

Through presentations and discussions, participants will explore practical examples of standard adoption, implementation challenges and federated discovery workflows. The workshop will conclude with a discussion on current barriers, open questions and future directions.

No prior GA4GH experience is required; the workshop is open to all interested participants.

Registration (free): register

For any questions about the workshop, please contact: anais.mottaz@hesge.ch

Speakers

Michael Baudis
Worawich Phornsiricharoenphant
Anaïs Mottaz
Patrick Ruch
Venkata Satagopam
Hinri Kerstens (in person or remote, to be confirmed)
Emidio Capriotti (tentative)

Schedule (tentative)

Time	Session	Description
09:30-09:45	Introduction and context	Scope of the workshop and interoperability challenges in genomic variant data
09:45-10:30	GA4GH standards for variant representation	Introduction to GA4GH standards for variant representation, with a focus on VRS and Categorical VRS
10:30-11:30	Practical perspectives on variant representation	Examples of VRS use in national genomic platforms, public variant resources such as ClinVar, literature-derived variant standardisation efforts and challenges, and VRS-Python
11:30-12:00	From observed variants to interpretation and reporting	Clinical bioinformatics perspectives on connecting observed variants to interpretation and reporting workflows
12:00-13:30	Lunch break
13:30-14:15	Federated discovery	Introduction to the GA4GH Discovery Work Stream and the Beacon protocol, including use for federated discovery across distributed resources
14:15-14:45	Implementation perspectives	Implementation experiences from deploying GA4GH standards in practice
14:45-15:00	Wrap-up and open discussion	Key messages, open questions and future directions

June 14, 2026
in Conferences, World Biodiversity Forum

WBF2026 - Workshop: “Biodiversity Evidence: Foundations of a Community of Practice to streamline and innovate literature workflows”

We are pleased to announce our participation in the CON20 workshop, “Biodiversity Evidence: Foundations of a Community of Practice to streamline and innovate literature workflows” at the World Biodiversity Forum 2026 (Davos, 14-19 June 2026).

The workshop addresses the growing challenge of navigating an expanding body of biodiversity literature, including both peer-reviewed publications and grey literature. It will bring together invited participants to review existing tools and approaches for literature identification, screening, and preparation for synthesis, and to discuss current gaps and needs.

Through short presentations and breakout discussions, participants will work toward establishing the foundations of a Community of Practice (CoPLit) aimed at improving and coordinating literature workflows. The session is part of a broader series at WBF2026 focused on biodiversity evidence.

Registration form 📝: https://tinyurl.com/WBF-CON20

Details 🔎: https://meetingorganizer.copernicus.org/WBF2026/session/55598

January 1, 2026

Scalable Biodiversity Monitoring by Broadening Article Acquisition

For over a decade, the Swiss Institute of Bioinformatics and HES-SO/HEG Geneva have maintained the SIB Literature Services (SIBiLS, Gobeill et al. 2020). SIBiLS is also an ELIXIR Data Resource, supported by the Data Platform commissioned services.

In 2023, SIBiLS received a significant upgrade with the launch of the Biodiversity PMC digital library. SIBiLS operates the backend of Biodiversity PMC, which has emerged as its main user portal. Biodiversity PMC is becoming a global resource in the field: it is the largest digitally-native repository of articles for biodiversity research and related disciplines, delivering a “One Health” library with broader biodiversity-related coverage than PubMed or Europe PMC.

ScaleBioM aims to enhance Biodiversity PMC by: 1) broadening its coverage by harvesting content directly from OpenAlex; 2) maintaining topical consistency by using automatic filtering methods to harvest only biodiversity-related content from OpenAlex.
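To make the filtering idea concrete, here is a minimal sketch of a topical filter applied to OpenAlex-style records. OpenAlex delivers abstracts as an `abstract_inverted_index` (word to positions), so the text is rebuilt first. The keyword list, the `min_hits` threshold and the helper names are assumptions for illustration; the filtering methods actually used by ScaleBioM are not described here and are more likely to be trained classifiers than a keyword match.

```python
from typing import Dict, List, Optional


def rebuild_abstract(inverted_index: Optional[Dict[str, List[int]]]) -> str:
    """Turn an OpenAlex-style inverted index (word -> positions) back into text."""
    if not inverted_index:
        return ""
    positioned = [(pos, word) for word, positions in inverted_index.items() for pos in positions]
    return " ".join(word for _, word in sorted(positioned))


# Hypothetical seed vocabulary, chosen for this example only.
BIODIVERSITY_TERMS = {"biodiversity", "species", "ecosystem", "taxonomy", "habitat", "endemic"}


def is_biodiversity_related(work: dict, min_hits: int = 2) -> bool:
    """Keep a record when enough distinct seed terms occur in its title or abstract."""
    text = f"{work.get('title') or ''} {rebuild_abstract(work.get('abstract_inverted_index'))}"
    tokens = {token.strip(".,;:()").lower() for token in text.split()}
    return len(tokens & BIODIVERSITY_TERMS) >= min_hits


works = [
    {"title": "Endemic species on island ecosystems",
     "abstract_inverted_index": {"Island": [0], "habitat": [1], "loss": [2]}},
    {"title": "A faster sorting algorithm", "abstract_inverted_index": None},
]
kept = [w["title"] for w in works if is_biodiversity_related(w)]
print(kept)
```

A filter of this shape runs at harvest time, so off-topic records never enter the collection; a stricter `min_hits` trades coverage for topical consistency.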

We will evaluate how these services can help ecologists, and in particular our partner, Prof. Clara Zemp of the University of Neuchâtel, to monitor biodiversity in island ecosystems. The development of the automatic filtering services will also benefit from assessments performed by IPBES.

November 17, 2025
in ELIXIR

ELIXIR Platforms Co-Located Event, Autumn 2025

SIB will participate in the ELIXIR Platforms Co-Located Event, held from 17 to 20 November 2025 in Marseille, France. The event brings together three core ELIXIR platforms (Interoperability, Data, and Tools) for several days of coordinated meetings and cross-platform discussions.

The meeting aims to support alignment across platforms, address shared technical and strategic topics, and facilitate collaboration among ELIXIR nodes. Bringing the platforms together in a single event creates space for coordinated work on topics such as data interoperability standards, cross-platform service integration, and the consolidation of workflows and registries that operate across several ELIXIR resources.

🧭 Event page: https://elixir-europe.org/events/elixir-platforms-co-located-event-autumn-2025

🔗 ELIXIR Interoperability Platform: https://elixir-europe.org/platforms/interoperability

📊 ELIXIR Data Platform: https://elixir-europe.org/platforms/data

🛠️ ELIXIR Tools Platform: https://elixir-europe.org/platforms/tools

November 13, 2025
in Conferences, Langue française

Réseau Opale - colloquium: « Quelles ressources pour dire le monde contemporain ? »

On 13 November 2025, the Délégation générale à la langue française et aux langues de France is organising the colloquium « Quelles ressources pour dire le monde contemporain ? » (“What resources to describe the contemporary world?”) at the Fondation Simone et Cino Del Duca (Institut de France, Paris), as part of the meetings of the OPALE network (Organismes francophones de politique et d’aménagement linguistiques).

The event will bring together researchers, decision-makers and institutional stakeholders around a central question: how to preserve and promote linguistic and terminological diversity in the age of artificial intelligence.

Prof. Patrick Ruch will take part in the round table « Quels nouveaux espaces pour dire le monde ? » (“What new spaces to describe the world?”), which will address the discoverability of French-language content and the role of open infrastructures and linguistic resources in the research and artificial intelligence ecosystem. It will be an opportunity to reflect on how to foster knowledge sharing and scientific cooperation within the French-speaking world, by developing accessible and collaborative resources that allow French to maintain an active and lasting presence in the digital sphere and in emerging technologies.

🔗 Programme and information: https://www.culture.gouv.fr/thematiques/langue-francaise-et-langues-de-france/agir-pour-les-langues/quelles-ressources-pour-dire-le-monde-contemporain-colloque-a-paris

🤝 Réseau Opale: https://www.reseau-opale.org/

November 11, 2025
in Biodiversity

DEST training in Berlin: Biodiversity Data Reuse

This week, the SIB Text-mining group participated in the Distributed European School of Taxonomy (DEST) in Berlin, delivering a specialized session on processes of annotating and sharing biodiversity data. The lecture focused on the complementary reuse of data extracted from publications, highlighting practical tools and workflows for researchers and institutions.

The training featured hands-on exercises using BiodiversityPMC, allowing participants to test specific annotation and question-answering workflows on their own use cases. They explored front-end interfaces, accessed data through SIBiLS collections, back-end services, and APIs, and experimented with curation-support tools and advanced triage systems. The session provided a practical setting for researchers to engage directly with biodiversity data, enhancing their skills in discovering, extracting, and reusing information from scientific publications.

🦋 About DEST: Discover the initiative supporting taxonomic training and capacity building across Europe.

📚 Explore BiodiversityPMC: Learn more about the platform and its functionalities for accessing and reusing biodiversity data.

November 11, 2025
in FHDportal

ELIXIR FHD Community Day & HDTR Project Meeting

The SIB Text-Mining group will take part in the ELIXIR FHD Community Day and HDTR Project Meeting on 11 November 2025, presenting progress on the metadata assignment and search services developed within FHDportal.

These services aim to enhance the description and findability of deposited human data by using AI-assisted metadata assignment based on established vocabularies. We are developing a multi-class, multi-level classifier for MeSH terms, designed to handle the high-dimensional and sparse nature of these labels, and trained on a dataset of ~9 million annotated examples (supplementary data, from SIBiLS).

Scaling this approach is challenging: the high dimensionality, sparsity, and label imbalance of MeSH make training and evaluation demanding. Ensuring fair and reliable descriptor assignment requires careful model design, including sparse-aware loss functions, sampling strategies, or exploration of generative modeling approaches, all supported by large-scale distributed training.
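One common way to make a loss function aware of label imbalance is to weight the positive term of binary cross-entropy by each label's negative-to-positive ratio. The sketch below shows that technique on toy numbers; it is an illustration under stated assumptions, not the loss used in FHDportal, and the label names and counts are hypothetical.

```python
import math


def positive_weights(label_counts: dict, n_examples: int) -> dict:
    """Weight each label by its negative-to-positive ratio, so rare labels count more."""
    return {label: (n_examples - count) / count for label, count in label_counts.items()}


def weighted_bce(probs: dict, gold: set, weights: dict) -> float:
    """Mean binary cross-entropy over labels, with the positive term scaled per label."""
    eps = 1e-12
    total = 0.0
    for label, p in probs.items():
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if label in gold:
            total += -weights[label] * math.log(p)
        else:
            total += -math.log(1.0 - p)
    return total / len(probs)


# Toy counts (hypothetical): one frequent and one rare descriptor among 1,000 examples.
weights = positive_weights({"frequent_label": 800, "rare_label": 10}, n_examples=1000)

# Missing the rare descriptor is penalised far more than missing the frequent one.
miss_rare = weighted_bce({"frequent_label": 0.9, "rare_label": 0.1}, {"rare_label"}, weights)
miss_frequent = weighted_bce({"frequent_label": 0.1, "rare_label": 0.9}, {"frequent_label"}, weights)
print(miss_rare > miss_frequent)
```

With tens of thousands of descriptors, most of them rare, an unweighted loss can be minimised by predicting almost nothing; per-label weights counteract that, at the cost of more false positives on rare labels.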

November 5, 2025
in PhD Thesis

EMNLP 2025 in Suzhou: TransBERT Paper Published

We are pleased to announce our participation in EMNLP 2025 in Suzhou, China (Nov. 4–9, 2025) with the publication of the paper “TransBERT: A Framework for Synthetic Translation in Domain-Specific Language Modeling.”

This work presents TransBERT, a framework for pre-training language models using synthetically translated text to address the limited availability of domain-specific data in non-English languages. The TransCorpus toolkit, which covers French, German, Spanish, and Hindi, and the TransBERT-bio-fr model are both available to explore and try.

☕ Julien Knafou is on site and happy to connect for a chat or coffee!

🤗 Model: https://huggingface.co/jknafou/TransBERT-bio-fr

🖥️ Toolkit: https://github.com/jknafou/TransCorpus

📚 Datasets: https://huggingface.co/jknafou/datasets?search=transcorpus

June 30, 2025

Server migration from July 28th to 30th

The servers will be relocated from the current server room to a new one dedicated to us. As a result, our services will be offline from July 28th to 30th. We apologise for the inconvenience.

January 1, 2025
in ELIXIR Implementation Studies on "Human data and translational research (HDTR)", Joint Project

FHDportal

Theme: Data Deposition

Human data, especially genomic data, is increasingly being federated across borders and institutions, with many stakeholders participating in multinational and global biomedical and health data networks, fostering collaborations and partnerships. While such international efforts are essential for the compilation and reuse of data, regulatory constraints often hinder the movement of certain data beyond organisational or national boundaries. Centralised approaches such as the Central European Genome-Phenome Archive (CEGA) are valuable, but not all data can be centralised.

The Federated European Genome-phenome Archive network (FEGA) addresses this, with early work concentrated on local collection of data with central archiving of metadata. FHDportal aims to support both federated and central submission of metadata. It will do this by providing a reusable portal for gathering and storing metadata at a national level, and submitting required metadata centrally to enable discovery of datasets via the CEGA. FHDportal complements the existing system by providing a way to explore richer metadata (for example, including detailed information on specific datasets or local funding information), while enabling a core set of metadata to be queried centrally.

FHDportal will be deployed and tested on FEGA nodes, and should be of interest to the many other countries seeking to join FEGA. The need for FHDportal is based on experience during onboarding and in moving to production nodes. It will offer a common solution for local mobilisation of data and metadata, which can be adapted to local situations. During development, it will be tested on both new and well-established nodes using different technical platforms and infrastructures. The resulting software will be provided to the whole community, and will hopefully become part of the emerging toolkit for new FEGA nodes wishing to establish themselves, and to ensure their nodes meet local needs while bringing European scale benefits.

The SIB Text Mining group will develop a service powered by a dedicated language model to support the semi-automatic assignment of descriptors at deposition time. The service aims to facilitate the provision of metadata by end users, as explored in Teodoro et al. 2017.

Nodes involved: ELIXIR Switzerland, ELIXIR Finland, ELIXIR Luxembourg, ELIXIR UK

Communities: Federated Human Data, Human Copy Number Variation