Pangeanic
Pangeanic – PangeaMT.com is an on-premises and cloud-based platform for AI-enhanced Natural Language Processing solutions: Neural Machine Translation, intelligent data extraction, anonymization and pseudonymization, summarization, etc. Pangeanic – PangeaMT has had great success working with clients that require high speed, high quality translation of corporate content in the legal, financial and technology fields. Government offices in Spain and the EU as well as in the US are among its top clients.
Working with some of the largest and most successful legal and cognitive companies in Europe, Nasdaq-listed companies and several multinationals, Pangeanic – PangeaMT solutions help their clients reach international markets quicker than their competitors, creating new revenue opportunities and expanding their global presence and brand.
The Pangeanic Data repository has been the basis of other EU projects, such as the recent Neural Translation for the EU (NTEU), and will also help Europeana Translate to become a truly multilingual experience. As of early 2021, the Pangeanic Data Library had accumulated nearly 11Bn sentences as clean training sets in 85 languages, plus 22.3Bn monolingual sentences for language modelling.
CNRS
Founded in 1939, the Centre National de la Recherche Scientifique (National Center for Scientific Research) is a government-funded research organization under the administrative authority of France's Ministry of Research. CNRS research units are spread out throughout France, and employ a large body of permanent researchers, engineers, technicians, and administrative staff.
The CNRS annual budget represents one-quarter of the French public spending on civilian research. CNRS is organized in 1,211 laboratories, either intramural or in partnerships with universities, other research organizations or industry. As one of the largest fundamental research organizations in Europe, CNRS is involved in all scientific fields and is largely involved in national, European and international projects. Interdisciplinary programs and actions offer a gateway into new domains of scientific investigation and enable CNRS to address the needs of society and industry.
The present project will be managed by the Ile de France Sud Regional Office of CNRS (DR4) which is located on the Paris-Saclay Campus, an exceptional scientific and economic area, accommodating a large number of scientific establishments and institutions including public research organizations (CNRS, INRA, ONERA, Inria, CEA…), higher education establishments (École Polytechnique, Université Paris-Sud, Institut d’Optique Graduate School, CentraleSupelec, ENS Paris-Saclay, …) and private research centers (Air Liquide, Danone, Thales R&T, etc).
In the MAPA Project, CNRS represents the LIMSI-CNRS laboratory, UPR3251, directed by François Yvon. The LIMSI laboratory (“Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur”) is a public research laboratory operated by CNRS in association with Université Paris-Sud, located in Orsay (south of Paris, France) and hosts both researchers and teaching staff. LIMSI-CNRS is France's largest research laboratory working on language technologies (LTs), with approximately 80 staff members (both permanent and non-permanent researchers) working in this area.
ELDA
ELDA (the Evaluations and Language Resources Distribution Agency) was created in 1995 as the organisational infrastructure with the mission of providing a central clearing house for Language Resources (LR) of the European Language Resources Association (ELRA – http://www.elra.info). ELDA was set up to identify, classify, collect, validate and distribute the language resources that are needed by the Human Language Technology (HLT) community. For the past few years, ELDA has been involved in the identification, curation, and sharing of a large number of key resources for Machine Translation activities (development and evaluation of systems and applications within European and national evaluation projects).
In addition to work on data production, processing and annotation, validation and quality control, most of these projects also involved work on legal framework management for the produced resources, covering clearing IPR aspects as well as working on licensing issues. ELDA is currently leading the IPR clearance activities in the framework of a number of CEF-financed initiatives collecting, producing and processing data for the improvement of the eTranslation platform.
Furthermore, ELDA coordinates the work on anonymisation within such CEF initiatives, covering data under consideration or in preparation for the EC as well as requests from stakeholders about intended data donations. This involves defining best practices for data management and supporting users with their concerns when handling potentially sensitive data. The in-house legal team is also part of this activity. Given this expertise in data management and sharing activities, which are often disrupted by anonymisation issues, ELDA believes that MAPA will provide the technology needed to alleviate the problem and that ELDA can contribute with the necessary data collection, preparation, annotation and IPR clearing, as well as with technology assessment and benchmarking.
Further key services that demonstrate ELDA's operational capacity across a wide range of Language Resources:
- ELDA manages the largest catalogue of LRs (http://catalog.elra.info/search.php).
- ELDA co-organizes the most important conference on Language Resources and Evaluation (http://www.lrec-conf.org).
- ELDA was a partner of FlaReNet and META-NET that contributed to charting the field (players, inventories, Language Resources cataloguing, international collaboration).
- ELDA manages the LRE-Map (http://www.resourcebook.eu/) to monitor the use and creation of language resources with over 4,000 resources registered to date.
University of Malta
The University of Malta (UM) traces its origins to the founding of the Collegium Melitense by the Jesuits in 1592. It is the highest teaching institution of the State, by which it is mainly financed.
The University strives to create courses which are relevant and timely in response to national, regional and international needs. There are over 10,000 students, including over 750 foreign students from nearly 80 different countries, following full- or part-time courses. The University has been involved as coordinator and partner in numerous EU-funded projects under various Programmes including FP5/6/7, Lifelong Learning Programme, Culture 2000, Tempus, H2020 and various other international and regional programmes. The University is also represented in various European and international University networks and groups.
The UM enjoys a highly inter-disciplinary research culture, with small, focussed departments and institutes engaging in extensive collaboration.
The University of Malta has applied for the implementation of the Human Resources Strategy for Researchers (HRS4R) consisting of a Charter and Code, and is presently looking into the forty aspects that combine the Charter and Code.
In January 2018, a committee was set up to oversee the process. It consists of seven members, namely the Pro-Rector for Research and Knowledge Transfer, the Director of Research Support Services, the Director for Corporate Research and Knowledge Transfer, the Director and Deputy Director for Human Resources Management and Development, the Director of Finance, the Deputy Director for Externally Funded Projects, and the HR Manager for Project Support. A working group has also been set up to implement the process, which consists of the Deputy Director for Externally Funded Projects, the Deputy Director for Human Resources Management and Development, and the HR Manager for Project Support.
An initial analysis of the forty aspects has been carried out by the working group, which included a thorough analysis of what policies, procedures and practices are in place and in line with the Charter and Code. Seventeen meetings with management departments directly or indirectly responsible for researchers’ HR-issues have been held in this regard. Presently, the committee is looking into the initial results of each of the forty aspects.
Seven committee meetings/workshops have been held so far, covering the ethical and professional aspects, the recruitment aspects, as well as the working conditions, social security, and training aspects. In addition, a questionnaire will be distributed later this year to all stakeholders, including researchers ranging from R1 to R4, to examine their perception of the aspects mentioned above.
Tilde
Tilde is a leading European language technology company, specialising in language technologies. To enable languages in the digital age, as a language technology innovator, Tilde provides custom machine translation systems, online terminology and knowledge management services, mobile applications, intelligent virtual assistants, speech processing (analysis and generation) solutions, and proofing tools. Tilde has specific expertise and competence in developing high-demand cloud-based and desktop language technology solutions for complex, highly inflected languages, particularly smaller European languages. Tilde offers language resources for big unstructured multilingual data processing, e.g. speech resources for speech processing, text resources for machine translation, terminology resources for semantic and multilingual annotation of digital content, as well as platforms for secure management and application of language resources in machine learning and artificial intelligence driven technologies.
Tilde maintains an extensive multilingual data repository, the Tilde Data Library, which includes 12.35 billion parallel sentences and 23.85 billion monolingual sentences in 124 languages, as well as over 4 million authoritative terms and over 20 million automatically extracted terms in over 25 languages. In addition, Tilde collected and processed language assets for the under-resourced Baltic languages to develop large-scale machine translation services for the Latvian public sector (Hugo.lv) and the Lithuanian public sector (Versti.eu).
Tilde is one of the top 100 LSPs (Language Service Providers) worldwide in offering translation services. Through years of experience in providing customized translation and localization services, Tilde has developed exceptional expertise across multiple industries, including IT, telecommunications, automotive, pharmacy and life sciences, banking, finance and consumer goods.
Tilde's excellence in artificial intelligence driven language technologies is showcased by its success in neural machine translation development and deployment for its customers. In 2017 and 2018, Tilde participated in the machine translation developer competition (the WMT shared task on news translation) and developed the best-performing machine translation systems for ENG-LAV-ENG (2017) and ENG-EST-ENG (2018). Since 2018, Tilde neural machine translation systems have been available to all users of its cloud-based machine translation platform Tilde MT. Since 2017, Tilde neural machine translation systems have supported translation efforts for the EU Council presidencies in Estonia (2017), Bulgaria (2018), Austria (2018) and Romania (2019), and will continue supporting the EU Council presidencies in Finland (2019) and Croatia in the first half of 2020.
Tilde is a full member of the Big Data Value Association (BDVA), elected to the board of directors and EU partnership board and leading the SME group. Tilde also serves on the board of the European Language Resource Association (ELRA), European language technology industry association (LT-Innovate), and Multilingual Europe Technology Alliance (META-NET).
Tilde has a long track record of large-scale EU projects, both as coordinator and as partner, including the 5th, 6th and 7th Framework Programmes, as well as H2020, ICT PSP, eContent, EUROSTARS and CEF Telecom. Tilde is a long-time technology partner of the EC and is currently helping to develop its large-scale automated translation infrastructure, Connecting Europe Facility (CEF) eTranslation.
Tilde has 130 employees, including 7 PhDs advancing research in language technology and artificial intelligence. Tilde's award-winning technologies serve a large user base, from corporations and governments to SMEs and individuals.
Tilde is a consortium member of the European Language Resource Coordination (ELRC) actions and will serve as liaison between activities carried out in this project and ELRC activities. The ELRC consortium has set up a permanent Language Resource Coordination mechanism to feed the CEF eTranslation DSI with relevant language resources in all official languages of the EU and CEF Associated countries, in order to improve the quality, coverage and performance of automated translation systems and solutions in the context of current and future CEF digital services.
BSC
Barcelona Supercomputing Center-Centro Nacional de Supercomputación (BSC-CNS) is the national supercomputing centre in Spain. We specialise in high performance computing (HPC) and manage MareNostrum, one of the most powerful supercomputers in Europe, located in the Torre Girona chapel.
BSC is at the service of the international scientific community and of industry that requires HPC resources. Our multidisciplinary research team and our computational facilities –including MareNostrum– make BSC an international centre of excellence in e-Science.
Since its establishment in 2005, BSC has developed an active role in fostering HPC in Spain and Europe as an essential tool for international competitiveness in science and engineering. The centre manages the Red Española de Supercomputación (RES), and is a hosting member of the Partnership for Advanced Computing in Europe (PRACE) initiative. We actively participate in the main European HPC initiatives, in close cooperation with other European supercomputing centres.
With a total staff of more than 725 R&D experts and professionals, BSC has been successful in attracting talent, and our research focuses on four fields: Computer Sciences, Life Sciences, Earth Sciences and Computer Applications in Science and Engineering.
Most of BSC’s research lines are developed within the framework of European Union research funding programmes, and the centre also does basic and applied research in collaboration with leading companies. The quality of our research was recognized by the Spanish government with the Severo Ochoa Excellence Centre grant for cutting edge Spanish science.
Vicomtech
Vicomtech is an applied research centre specialising in Advanced Interaction technologies, Computer Vision, Data Analytics, Computer Graphics and Language Technologies.
It belongs to Graphicsmedia.net, a strategic alliance of international applied research centres working in computer graphics and multimedia technologies.
Vicomtech aims to respond to the innovation requirements of companies and institutions. To do this, it:
- Conducts applied research and develops multimedia visual interaction and communications technologies.
- Complements and closely collaborates with industry, universities and other technology centres.
- Promotes mobility and training for its researchers.
This command of knowledge and technologies, either directly or through the network, provides value to clients, as Vicomtech:
- Suitably responds to its clients’ needs.
- Enables companies to make the most of the opportunities available to them.
- Proposes product improvement or developments based on state-of-the-art science and technology knowledge.
Vicomtech’s Board of Trustees is currently made up of relevant representative members within its areas of action.
Vicomtech has been awarded the European Commission's HR Excellence in Research recognition, which demonstrates its commitment to the open, transparent and merit-based recruitment of researchers (OTM-R: Open, Transparent and Merit-based Recruitment of Researchers). This recognition adds to its ISO 9001:2015 and UNE 166002:2014 certifications and marks another step towards international excellence in advanced R&D&I management, which has been part of the centre's DNA since the beginning.
All its activities are covered by the innovation management system – continuous improvement and measurement of results, technology innovation process optimisation and knowledge transfer and generation – ensuring it uses the highest quality methods.
Vicomtech’s research profile makes it a bridge between the local and international spheres, and this applied research model places new opportunities for accessing a global business environment within reach of local companies, allowing them to benefit from the latest international advances. Vicomtech’s participation in international projects complements and boosts its main work on local projects.
Additionally, Vicomtech contributes to knowledge by training new researchers and disseminating its results through world-renowned publications and conferences.