Robust recipes to align language models with human and AI preferences
A Python + iCloud wrapper to access iPhone and Calendar data.
Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting has…
Jina examples and demos to help you get started
☁️ Build multimodal AI applications with cloud-native stack
An open registry for hosting Jina executors via container images
PORORO: Platform Of neuRal mOdels for natuRal language prOcessing
This repo supports various cross-lingual transfer learning & multilingual NLP models.
A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)
A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).
A word2vec negative sampling implementation with correct CBOW update.
Acceptance rates for the major AI conferences
LibKGE - A knowledge graph embedding library for reproducible research
Getting interpretable dimensions in word embedding spaces.
Analyzing mBERT's multilinguality in a small laboratory setting
A list of selected resources, methods, and tools dedicated to Legal Text Analytics.
Helper to create posts for Bayern Ticket Mitfahrer groups on Facebook.
Resources for the "SummEval: Re-evaluating Summarization Evaluation" paper
Unsupervised text tokenizer focused on computational efficiency
Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)
Papers & presentation materials from Hugging Face's internal science day
A Python framework for creating, editing, and invoking Noisy Intermediate-Scale Quantum (NISQ) circuits.
Language-Agnostic SEntence Representations
A framework to learn cross-lingual word embedding mappings
👓 A web interface for gpustat: monitor GPU clusters at a glance