The BiTeM Group, headed by Patrick Ruch, is part of the Information Science Department of the HEG (University of Applied Sciences, Geneva). It brings together a network of researchers (computer scientists, biologists, bioinformaticians, MDs...) affiliated with various research institutions in Geneva. More information about BiTeM can be found on the SIBTex web pages: http://www.isb-sib.ch/groups/geneva/stm-ruch.html. SIBTex, the Text Mining group of the SIB Swiss Institute of Bioinformatics, gathers BiTeM's infrastructure services for biologists and biocurators.
The BiTeM group is involved in several research projects, with a strong focus on clinical and biological data. The main research areas developed are:
1. Text Mining: Text mining, sometimes called text data mining and roughly equivalent to text analytics, is the process of deriving high-quality information from textual contents. High-quality information is typically derived by devising patterns and trends through means such as association or pattern learning. Knowledge-intensive resources such as dictionaries, terminologies, ontologies and manually crafted rules play an important role in the domain. Text mining usually involves structuring unstructured or semi-structured input text to generate a more structured (or enriched) database. A typical text mining task, such as question answering, combines information retrieval, named-entity recognition and information extraction. Quality in text mining usually refers to some combination of relevance, novelty, and interestingness. Other common tasks include text categorization (filtering, descriptor assignment...), sentiment analysis, document summarization, and entity relation modeling (e.g., extraction of protein-protein interactions).
2. Bibliomics: Bibliomics is the bioinformatics study of the bibliome. The bibliome is the totality of the biological text corpus; the term underlines the importance of biological text contents for the biomedical sciences. In practice, bibliomics is often regarded as the application of text mining to the literature in molecular biology, and to MEDLINE in particular. However, the notion tends to expand beyond the literature to other contents, such as the web, patent documents or clinical reports. Biologists and computer scientists mine the bibliome to discover new gene targets and drugs.
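To make the role of dictionaries in named-entity recognition and entity relation modeling more concrete, here is a minimal Python sketch. It is an illustration only, not BiTeM's actual method: the three-term lexicon, the function names recognize and cooccurrences, and the use of sentence-level co-occurrence as a stand-in for relation extraction are all assumptions made for this example.

```python
import re
from itertools import combinations

# Hypothetical mini-terminology (assumption): lower-cased surface form -> entity type.
# A real system would load a curated resource instead.
LEXICON = {
    "brca1": "GENE",
    "tp53": "GENE",
    "tamoxifen": "DRUG",
}

TOKEN = re.compile(r"[A-Za-z0-9]+")


def recognize(sentence):
    """Return (surface form, type) pairs for lexicon terms found in a sentence."""
    found = []
    for match in TOKEN.finditer(sentence):
        token = match.group(0)
        label = LEXICON.get(token.lower())
        if label:
            found.append((token, label))
    return found


def cooccurrences(text):
    """Return pairs of distinct entities mentioned in the same sentence."""
    pairs = []
    # Naive sentence splitting on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        names = sorted({token.upper() for token, _ in recognize(sentence)})
        pairs.extend(combinations(names, 2))
    return pairs
```

Calling recognize on a sentence returns the lexicon terms it contains with their types, and cooccurrences lists entity pairs that share a sentence. Co-occurrence is only a crude candidate generator: deciding whether two proteins actually interact requires further analysis of the sentence.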