Below is a collection of sample code and other goodies. Not all of it is production-ready, and some scripts may need small environment tweaks or changes to hard-coded values before they run. The scripts on GitHub should be cleaner.
My GitHub
ash-parser: SyntaxNet in pure Python with GPU support
nn-nlp-skeleton: Foundational code for NLP and Neural Network training tasks
nmt: Add GPU support to the encoder portion of the TensorFlow NMT sample code
mt7610u-linksys-ae6000-wifi-fixes: Update of MT7610U driver for modern Linux kernels
NLP-Related
Fix broken Korean filenames (e.g., Áö´ÉÇüÇüżҺм®±â.zip -> 지능형형태소분석기.zip)
Auto-detect different Korean encodings in current directory (UTF-8, CP949, EUC-KR)
Visualize Gensim word2vec model in TensorBoard
Compress large JSON text streams into one indexed zip file for efficiency
Separate large JSON text streams into HDF5 format
Separate large plain-text streams into HDF5 format
Preprocess, normalize, and tokenize dirty Unicode input text
Align TED English and Korean XML corpora (requires the preprocessing/normalizing/tokenizing code)
Convert Sejong POS-tagged corpus format to CoNLL-U format
Train word embeddings from a CoNLL corpus file
Convert huge SQL files to CSV files (experimental)
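For the two Korean encoding items at the top of this list, the general idea can be sketched in a few lines of Python. This is a minimal illustration and not the linked scripts themselves: the function names `detect_korean_encoding` and `fix_korean_name` are hypothetical, and the repair step assumes the damage came from CP949 bytes being read as Latin-1, which is only one of several ways such names get garbled.

```python
from typing import Optional

# Candidate encodings, tried in order. CP949 extends EUC-KR, so EUC-KR
# has to be tried first or it would never be reported.
CANDIDATES = ("utf-8", "euc-kr", "cp949")


def detect_korean_encoding(data: bytes) -> Optional[str]:
    """Return the first candidate that decodes `data` without errors.

    Hypothetical helper: a strict trial decode, not a statistical detector.
    """
    for name in CANDIDATES:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


def fix_korean_name(garbled: str) -> str:
    """Undo CP949 bytes that were mistakenly decoded as Latin-1.

    Assumption: the name was damaged in exactly that way. Anything that
    does not round-trip is returned unchanged.
    """
    try:
        return garbled.encode("latin-1").decode("cp949")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return garbled


if __name__ == "__main__":
    # Simulate the damage, then repair it.
    broken = "지능형.zip".encode("cp949").decode("latin-1")
    print(broken, "->", fix_korean_name(broken))
```

To apply this to a directory, one option is to pass a bytes path to `os.listdir` (for example `os.listdir(b".")`) so the raw filename bytes are available for `detect_korean_encoding`. Pure ASCII input is reported as UTF-8, since it is valid in all three encodings.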
Miscellaneous
Disable light on ASUS ROG STRIX IMPACT mouse under Linux