Speech processing researcher

Grupo FalaBrasil

Cassio T Batista

This is a legacy website. Please refer to @cassiotbatista instead.

I hold a PhD in Computer Science (2023) from the Federal University of Pará (UFPA) in Belém, Brazil. I currently do speech processing research at Vivoka in Metz, France. My professional experience is mostly in speech recognition and machine learning.


  • Speech processing


  • PhD in Computer Science, 2023

    Federal University of Pará

  • MSc in Computer Science, 2017

    Federal University of Pará

  • BSc in Computer Engineering, 2016

    Federal University of Pará


What am I (supposed to be) good at?

Speech Recognition

Kaldi, Icefall (K2), SpeechBrain, etc.

Linux & Tools

Arch, XMonad, Vim, Git, Python, C, etc.

Machine Learning

PyTorch, Scikit-learn, ONNX, etc.



Speech processing research


Jun 2023 – Present Metz, France
Speech recognition

Speech processing research


Mar 2021 – May 2023 Campinas, Brazil

Speech-based technologies:

  • Lattice rescoring with n-gram and neural network language models
  • ASR + VAD and SER (emotion)

PhD in Computer Science

Federal University of Pará (UFPA)

Dec 2017 – Oct 2022 Belém, Brazil

Speech-based technologies:

  • Kaldi ASR for Brazilian Portuguese
  • Utterance copy TTS in English using Klatt and deep learning techniques

MSc in Computer Science

Federal University of Pará (UFPA)

Mar 2017 – Dec 2017 Belém, Brazil

A universal remote control system in C++ for people with upper-limb motor disabilities, so they could control a TV via alternative methods.

  • OpenCV for head gesture recognition
  • PocketSphinx for speech recognition
  • Adaptive switches in hardware

Research Internship


Mar 2016 – Dec 2016 Belém, Brazil
A simulator in Python for the routing and wavelength assignment (RWA) problem over transparent, wavelength-multiplexed optical networks using Genetic Algorithms.

Summer Internship

Óbuda University (OE)

Mar 2014 – Jan 2015 Budapest, Hungary

Development of English speech modules for controlling Teki, a Turtlebot-based personal home assistant robot:

  • PocketSphinx desktop on Linux + ROS (offline)
  • Android’s Google ASR (online, over a Wi-Fi UDP connection)

Research Internship

Federal University of Pará (UFPA)

Jan 2012 – Feb 2016 Belém, Brazil

Development of resources and applications for speech recognition in Brazilian Portuguese:

  • PyQt4 CFG/BNF grammar tester for Julius
  • Acoustic model training on CMU Sphinx for KDE Simon Listens
  • Android client + Julius server vs. Google’s Android ASR

Recent Posts

Free Online Courses

I decided to post some comments on a few excellent online courses in computing and engineering that I started taking during the Covid-19 self-quarantine. All of them are available for free on YouTube.


Head Remote

A system that translates the user’s head gestures into remote-control commands for electronic devices

Speech Remote

A remote control system that translates the user’s spoken words into commands for electronic devices


Speech recognition and TV remote control using Android and BeagleBone Black

Recent Publications

Towards a Free, Forced Phonetic Aligner for Brazilian Portuguese Using Kaldi Tools

Forced phonetic alignment in Brazilian Portuguese using Kaldi tools.

A Parallel Strategy for a Genetic Algorithm in Routing Wavelength Assignment Problem Using GPU with CUDA

Routing and wavelength assignment simulator on NVIDIA CUDA GPUs.

Evaluating Alternative Interfaces Based on Puff, Electromyography and Dwell Time for Mouse Clicking

Statistical comparison of three mouse-click methods: mouth puffing, EMG, and dwell time. Two of the three were developed in hardware, and their schematics have been open-sourced.

Utterance Copy in Formant-based Speech Synthesizers Using LSTM Neural Networks

Estimating the input parameters of the Klatt88 formant-based speech synthesizer with long short-term memory (LSTM) neural networks.


Feel free to send an email to cassiotbatista@gmail.com