Thousands of ethnic groups contribute to the world's rich linguistic and cultural diversity. However, the majority of their languages, including more than 30 ethnic languages spoken in Bangladesh, remain severely underrepresented in digital resources and research. Despite their unique linguistic structures and speaker populations, little to no work addresses the preservation, translation, or computational processing of these languages. To highlight the difficulties faced by low-resource and endangered languages worldwide, this dataset focuses on the three leading ethnic languages (Chakma, Garo, and Marma), along with their corresponding Bengali and English translations. People from different ethnic groups use the Bengali script to write their own languages on social media platforms such as Facebook and Twitter, as well as in their daily lives. Because of significant linguistic differences, text written by native speakers of these languages in the Bengali script is still unintelligible to Standard Bengali speakers. Moreover, the lack of translation systems and language identification tools indicates the digital exclusion of these communities. This dataset addresses these gaps by documenting sentence-level linguistic samples in Chakma, Garo, and Marma through transliteration, in which the phonetics of each language are represented in the Bengali script. It also provides meaning-based translations in both Standard Bengali and English, rather than literal word-for-word mappings, to preserve the intended meaning of the original sentences. The dataset is thus a valuable resource for advancing Natural Language Processing (NLP) and cultural preservation for low-resource ethnic languages worldwide.