While working with liblouis and Braille texts I wrote a little utility to help me convert between various braille "encodings". I hope it's not off-topic to post it here.
As an example suppose that you have the Braille ASCII text "HELLO WORLD" and you want to convert it to Unicode Braille. What you need to run is:
    # bconv -fa -tu 'HELLO WORLD'
    ⠓⠑⠇⠇⠕⠀⠺⠕⠗⠇⠙

or like this:

    # echo 'HELLO WORLD' | bconv -fa -tu
    ⠓⠑⠇⠇⠕⠀⠺⠕⠗⠇⠙

Generally you can convert from/to these Braille encodings:

  - Braille ASCII (North American/MIT Code)
  - Unicode Braille (like ⠉⠙⠑)
  - dots Braille (like p14p145p15 or (14)(145)(15) or similar)
  - pseudo Braille e.g.
        o o  o .  o o
        o .  . o  o o
        o .  . .  . .

All the conversions are handled by a python library (pybraille.py) which may also be helpful to those of you working with python.
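To illustrate the Braille ASCII to Unicode Braille conversion described above, here is a minimal standalone sketch. It is written for Python 3 (the posted code targets Python 2), the function name ascii_to_unicode is made up for this illustration, and only the 64-character table is taken from pybraille.py. It is not the library's own implementation.

```python
# Braille ASCII characters ordered by Unicode Braille code point
# (U+2800..U+283F); the table is copied from pybraille.py.
BRAILLE_ASCII = " A1B'K2L@CIF/MSP\"E3H9O6R^DJG>NTQ,*5<-U8V.%[$+X!&;:4\\0Z7(_?W]#Y)="


def ascii_to_unicode(text):
    """Hypothetical helper: map Braille ASCII to Unicode Braille.

    Characters that are not in the table (e.g. lower-case letters)
    are passed through unchanged, like the utility's default --onerr=p.
    """
    out = []
    for ch in text:
        idx = BRAILLE_ASCII.find(ch)
        # The position in the table is the offset from U+2800.
        out.append(chr(0x2800 + idx) if idx != -1 else ch)
    return "".join(out)


hello = ascii_to_unicode("HELLO WORLD")
```

Here hello holds the same eleven cells as the bconv output shown above; the space becomes the blank cell U+2800.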
This code has only been tested under Linux with python 2.6.5. The code for the utility is provided under the GPL, version 3.0 or later. The library is under the LGPL, v3 or later. Be warned that although I've done quite a lot of testing and I don't expect any hidden errors, nobody else has reviewed the code. If you like this or find it useful and want to give something back, I'll be grateful if you test it and report any bugs you find.
Here is the help page that you get from running the utility without any options:
--------------------------------------------------------
Usage: bconv -f FROM -t TO [other options] [input]

Convert from/to various types of braille text.
The supported TYPES OF BRAILLE are the following:

  a = Braille ASCII*  e.g. CDE (North American/MIT Code)
  u = Unicode Braille e.g. ⠉⠙⠑
  d = dots Braille    e.g. p14p145p15
  p = pseudo Braille  e.g. o o  o .  o o
                           o .  . o  o o
                           o .  . .  . .

EXAMPLES
  # bconv -f d -t u p1p12p14p145
  ⠁⠃⠉⠙
  # bconv -f a -t u ABCD
  ⠁⠃⠉⠙
  # bconv -f u -t d ⠁⠃⠉⠙
  p1p12p14p145
  # bconv -f u -t a ⠁⠃⠉⠙
  ABCD
  # bconv -s -f u -t p ⠁⠃⠉⠙
  o .  o .  o o  o o
  . .  o .  . .  . o
  . .  . .  . .  . .
  # bconv -fu -ta '⠉⠙ lala ⠑' --onerr=p
  CD lala E
  # bconv -fu -ta '⠉⠙ lala ⠑' --onerr=r
  CD??????E
  # bconv -fu -ta '⠉⠙ lala ⠑' --onerr=e
  ...
  ValueError: unibr_2_ascbr failed to convert

KNOWN BUGS
  --onerr p is not always working as expected. e.g.:
  # bconv -fa -tu FOOfoo --onerr=p # this is OK
  ⠋⠕⠕foo
  # bconv -s -fa -tu FOOfoo --onerr=p # this is NOT
  ⠋⠕⠕&//

Options:
  -h, --help            show this help message and exit
  -f FROMB              one character that specifies the braille type of input
                        (see TYPES OF BRAILLE)
  -t TOB                one character that specifies the braille type for the
                        output (see TYPES OF BRAILLE)
  --onerr=ACTION        how to handle errors during conversions. Allowed
                        values: p=pass input unchanged(default), e=raise an
                        error, r=replace invalid chars
  --repl=REPLACE_CHAR   the replacement string to use if you specify --onerr r
                        (default is ?)
  -s                    strip dots 7,8 from all cells
--------------------------------------------------------

Nick Demou
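The three --onerr modes listed in the help text can be pictured with a small sketch. This is Python 3 (the posted code is Python 2), and unicode_to_ascii with its on_error and repl parameters is a hypothetical stand-in for illustration, not the pybraille API; only the table is copied from the library.

```python
# Braille ASCII characters ordered by Unicode Braille code point
# (U+2800..U+283F); the table is copied from pybraille.py.
BRAILLE_ASCII = " A1B'K2L@CIF/MSP\"E3H9O6R^DJG>NTQ,*5<-U8V.%[$+X!&;:4\\0Z7(_?W]#Y)="


def unicode_to_ascii(text, on_error="p", repl="?"):
    """Hypothetical helper: map Unicode Braille to Braille ASCII.

    on_error mirrors the --onerr values of the utility:
      "p" = pass the offending character through unchanged
      "e" = raise ValueError
      "r" = substitute repl for the offending character
    """
    out = []
    for ch in text:
        idx = ord(ch) - 0x2800
        if 0 <= idx < len(BRAILLE_ASCII):
            out.append(BRAILLE_ASCII[idx])
        elif on_error == "e":
            raise ValueError("failed to convert %r" % ch)
        elif on_error == "r":
            out.append(repl)
        else:
            out.append(ch)
    return "".join(out)


sample = "\u2809\u2819 lala \u2811"    # the '⠉⠙ lala ⠑' input from EXAMPLES
print(unicode_to_ascii(sample, "p"))   # CD lala E
print(unicode_to_ascii(sample, "r"))   # CD??????E
```

Each of the six non-braille characters in " lala " (including the two spaces) is replaced separately, which is why the "r" result has six question marks.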
#!/usr/bin/python
# -*- coding: utf-8 -*-

# Licensed under the GNU General Public License, version 3.
# See the file http://www.gnu.org/copyleft/gpl.txt
# version 0.9 Copyright Nick Demou ndemou@xxxxxxxxxx 2012

# todo:
#   dot-braille handling doesn't support all the options that pybraille
#   gives me like setting non-default options for all of the following:
#   prefix delimiter suffix valid_chars

import pybraille
import optparse
import sys

def main():
    reload(sys).setdefaultencoding('utf8')
    usage = '''usage: %prog -f FROM -t TO [other options] [input]

Convert from/to various types of braille text.
The supported TYPES OF BRAILLE are the following:

  a = Braille ASCII*  e.g. CDE (North American/MIT Code)
  u = Unicode Braille e.g. ⠉⠙⠑
  d = dots Braille    e.g. p14p145p15
  p = pseudo Braille  e.g. o o  o .  o o
                           o .  . o  o o
                           o .  . .  . .

EXAMPLES
  # %prog -f d -t u p1p12p14p145
  ⠁⠃⠉⠙
  # %prog -f a -t u ABCD
  ⠁⠃⠉⠙
  # %prog -f u -t d ⠁⠃⠉⠙
  p1p12p14p145
  # %prog -f u -t a ⠁⠃⠉⠙
  ABCD
  # %prog -s -f u -t p ⠁⠃⠉⠙
  o .  o .  o o  o o
  . .  o .  . .  . o
  . .  . .  . .  . .
  # %prog -fu -ta '⠉⠙ lala ⠑' --onerr=p
  CD lala E
  # %prog -fu -ta '⠉⠙ lala ⠑' --onerr=r
  CD??????E
  # %prog -fu -ta '⠉⠙ lala ⠑' --onerr=e
  ...
  ValueError: unibr_2_ascbr failed to convert

KNOWN BUGS
  --onerr p is not always working as expected. e.g.:
  # %prog -fa -tu FOOfoo --onerr=p # this is OK
  ⠋⠕⠕foo
  # %prog -s -fa -tu FOOfoo --onerr=p # this is NOT
  ⠋⠕⠕&//
'''
    parser = optparse.OptionParser(usage)
    parser.add_option("-f", "", action="store", dest="fromb",
        help='one character that specifies the braille type of input (see TYPES OF BRAILLE)')
    parser.add_option("-t", "", action="store", dest="tob",
        help='one character that specifies the braille type for the output (see TYPES OF BRAILLE)')
    parser.add_option("--onerr", "", action="store", dest="action", default='p',
        help='''how to handle errors during conversions.
        Allowed values: p=pass input unchanged(default), e=raise an error, r=replace invalid chars''')
    parser.add_option("--repl", "", action="store", dest="replace_char", default='?',
        help='''the replacement string to use if you specify --onerr r (default is ?)''')
    parser.add_option("-s", "", action="store_true", dest="sixdot", default=False,
        help='strip dots 7,8 from all cells')
    opt, args = parser.parse_args()
    #print opt,args

    if args:
        inp = [ u' '.join(args) ]
    else:
        # we'll read and translate stdin
        inp = sys.stdin
        pass

    if opt.action.lower()=='p':
        pybraille.on_conv_err = pybraille.ON_CONV_ERR_PASS
    elif opt.action.lower()=='e':
        pybraille.on_conv_err = pybraille.ON_CONV_ERR_RAISE
    elif opt.action.lower()=='r':
        pybraille.on_conv_err = pybraille.ON_CONV_ERR_REPLACE
        pybraille.on_conv_err_replace_char = opt.replace_char

    fromb, tob = '',''
    if opt.fromb: fromb = opt.fromb.lower()
    if opt.tob: tob = opt.tob.lower()

    char2word = { 'a':'ascbr', 'u':'unibr', 'd':'dotbr', 'p':'psebr' }

    if fromb in ['a','u','d'] and tob in ['a','u','d','p']:
        if tob=='u':
            conv_func = lambda x: x # function that returns input unaltered
        else:
            conv_func = getattr(pybraille, "unibr_2_%s" % char2word[tob])

        for line in inp:
            l=line.decode('utf-8') # decode is needed if inp = sys.stdin

            # 1st translate to unibr
            if fromb<>'u':
                to_unibr_func = getattr(pybraille, "%s_2_unibr" % char2word[fromb])
                unibr = to_unibr_func(l.strip('\r\n'))
            else:
                unibr = l.strip('\r\n')

            # then strip 7,8 dots if requested
            if opt.sixdot:
                unibr = pybraille.strip_dots78( unibr )

            # finally translate to -t format
            out = conv_func(unibr)
            if opt.sixdot and tob=='p':
                out.print_8dot_brl = False
            print out
    else:
        parser.print_help()
        exit(-1)

if __name__ == "__main__":
    main()

#!/usr/bin/python
# -*- coding: utf-8 -*-

# Licensed under the GNU Lesser General Public License, version 3.
# See the file http://www.gnu.org/copyleft/lgpl.txt
# version 0.9 Copyright Nick Demou ndemou@xxxxxxxxxx 2012

'''
Convenience functions for dealing with braille text

Mostly functions to convert between various types of braille
representations:
    Unicode Braille   "⠇⠁⠃"
    Braille ASCII     "LAB"
    Dots notation(s)  "p123 p1 p12" or "(123)(1)(12)" etc

Also support for printing pseudo braille like this:
    >>> print_psebr(unibr_2_psebr('⠃⠗⠁⠊⠇⠇⠑'))
    o .  o .  o .  . o  o .  o .  o .
    o .  o o  . .  o .  o .  o .  . o
    . .  o .  . .  . .  o .  o .  . .

IMPORTANT SHORTHANDS USED IN THIS MODULE
========================================
ascbr = Braille ASCII    -- e.g. " LAB" (North American/MIT Braille ASCII Code)
unibr = Unicode Braille  -- e.g. "⠀⠇⠁⠃"
dotbr = dots Braille     -- e.g. "p0p123p1p12", or "(0)(1,2,3)(1)(1,2)"
                            or "0,123,1,12"*
psebr = pseudo Braille** -- e.g.: . .  o .  o .  o .
                                  . .  o .  . .  o .
                                  . .  o .  . .  . .

 *: this code also supports many other dots-braille schemes that you
    may come up with by combining a prefix, a delimiter and a suffix
    (all of them optional) e.g. this fancy dot-style: "<1-2-3>" has
    prefix='<', delimiter='-' and suffix='>'
**: see notes about pseudo braille representation in class PseudoBraille

UNDERSTANDING THE CODE
======================
To understand the code you'll often need the following graphic
as a reference:

The 8 dots of a braille cell
----------------------------
To the left of the dot its numerical order (1,2,3...8 )
Inside the dot its hex value (1,2,4,8,0x10,...,0x80)
       __        __
    1 / 1\    4 / 8\
      \__/      \__/
       __        __
    2 / 2\    5 /10\
      \__/      \__/
       __        __
    3 / 4\    6 /20\
      \__/      \__/
       __        __
    7 /40\    8 /80\
      \__/      \__/

Example: Take the character with unibr ⠕
    Its dotbr is p135 (the numbers at the left of the dots)
    Its unicode position is 0x2800+1+4+0x10
    (add the numbers inside plus 0x2800)
'''

ON_CONV_ERR_RAISE = 1
ON_CONV_ERR_PASS = 2
ON_CONV_ERR_REPLACE = 3

# override them if you want:
on_conv_err = ON_CONV_ERR_PASS # see above for allowed values
on_conv_err_replace_char = '?'

# unibr to ascbr translation (based on wikipedia and BrailleUtils)
_BrailleAscii__ascii = " A1B'K2L@CIF/MSP\"E3H9O6R^DJG>NTQ,*5<-U8V.%[$+X!&;:4\\0Z7(_?W]#Y)="
_BrailleAscii__unicode = u"⠀⠁⠂⠃⠄⠅⠆⠇⠈⠉⠊⠋⠌⠍⠎⠏⠐⠑⠒⠓⠔⠕⠖⠗⠘⠙⠚⠛⠜⠝⠞⠟⠠⠡⠢⠣⠤⠥⠦⠧⠨⠩⠪⠫⠬⠭⠮⠯⠰⠱⠲⠳⠴⠵⠶⠷⠸⠹⠺⠻⠼⠽⠾⠿"

########################################################################
class PseudoBraille(str):
    """Just a unicode braille string with the __str__ method overridden
    in order to print it as a pseudo-braille ascii text
    (it also has the print_8dot_brl property)

    Use it like this:
    >>> pb = PseudoBraille(u'⠕⠓'.encode('utf-8')) #<----note the encoding
    >>> pb.print_8dot_brl = False
    >>> print pb
    o .  o .
    . o  o o
    o .  . .
    """
    print_8dot_brl = True # set it to false to print 6dot pseudo braille

    def __str__(self):
        line=['','','','']
        for c in self.decode('utf-8'):
            o = ord(c)-0x2800
            if o>=0 and o<=0xff:
                line[0] += ('o' if (o & 0x01)<>0 else '.') + ' ' + ('o' if (o & 0x08)<>0 else '.') + '  '
                line[1] += ('o' if (o & 0x02)<>0 else '.') + ' ' + ('o' if (o & 0x10)<>0 else '.') + '  '
                line[2] += ('o' if (o & 0x04)<>0 else '.') + ' ' + ('o' if (o & 0x20)<>0 else '.') + '  '
                line[3] += ('o' if (o & 0x40)<>0 else '.') + ' ' + ('o' if (o & 0x80)<>0 else '.') + '  '
            else:
                if on_conv_err == ON_CONV_ERR_RAISE:
                    raise ValueError,'Non unicode braille PseudoBraille string %s' % self
                else:
                    line[0] += 'INV  '
                    line[1] += 'ALI  '
                    line[2] += 'D!!  '
                    line[3] += 'CHR  '
        line[0] = line[0][:-2] # strip the last two spaces
        line[1] = line[1][:-2]
        line[2] = line[2][:-2]
        line[3] = line[3][:-2]
        if self.print_8dot_brl:
            return '%s\n%s\n%s\n%s\n' % (line[0],line[1],line[2],line[3])
        else:
            return '%s\n%s\n%s\n' % (line[0],line[1],line[2])

########################################################################
def test_braille_ascii():
    '''unit testing -- some sanity tests'''
    for i in range(0,64):
        asc = _BrailleAscii__ascii[i]
        uni = _BrailleAscii__unicode[i]
        assert unibr_2_ascbr(uni) == asc
        assert ascbr_2_unibr(asc) == uni

def unibr_2_ascbr(unibr, liberal_input = True):
    '''convert unibr(⠗) to ascbr(R)

    liberal_input True means
        a) treat lower case letters as upper case
        b) pass invalid chars to output untranslated'''
    if len(unibr)<>1:
        return ''.join([unibr_2_ascbr(e) for e in unibr])
    try:
        out = _BrailleAscii__ascii[ord(unibr)-0x2800]
    except:
        if on_conv_err == ON_CONV_ERR_RAISE:
            raise ValueError,'unibr_2_ascbr failed to convert %s' % unibr
        elif on_conv_err == ON_CONV_ERR_PASS:
            out = unibr
        else:
            out = on_conv_err_replace_char
    return out

def unibr_2_dotbr(unibr, prefix = 'p', delimiter = '', suffix = ''):
    '''convert unibr(⠗) to dotbr (eg p1235)

    Setting prefix to '(' delimiter to ',' and suffix to ')'
    will return another style of dotbr: "(1,2,3,5)".
    You can customize them as you wish.
    '''
    if len(unibr)<>1:
        return ''.join([unibr_2_dotbr(e) for e in unibr])
    o = ord(unibr)-0x2800
    if o>=0 and o<=0xff:
        dots = prefix
        if (o & 0x01)<>0: dots += '1' + delimiter
        if (o & 0x02)<>0: dots += '2' + delimiter
        if (o & 0x04)<>0: dots += '3' + delimiter
        if (o & 0x08)<>0: dots += '4' + delimiter
        if (o & 0x10)<>0: dots += '5' + delimiter
        if (o & 0x20)<>0: dots += '6' + delimiter
        if (o & 0x40)<>0: dots += '7' + delimiter
        if (o & 0x80)<>0: dots += '8' + delimiter
        if dots == prefix: dots = prefix + '0'
        dots = dots.rstrip(delimiter) + suffix
    else:
        if on_conv_err == ON_CONV_ERR_RAISE:
            raise ValueError,'unibr_2_dotbr got non unicode braille char %s' % unibr
        elif on_conv_err == ON_CONV_ERR_PASS:
            dots = unibr
        else:
            dots = on_conv_err_replace_char
    return dots

def unibr_2_psebr(unibr):
    '''convert unibr(⠗) to list of psebr's '''
    return PseudoBraille(unibr.encode('utf-8'))

def ascbr_2_dotbr(ascbr):
    return unibr_2_dotbr(ascbr_2_unibr(ascbr))

def ascbr_2_psebr(ascbr):
    return unibr_2_psebr(ascbr_2_unibr(ascbr))

def ascbr_2_unibr(ascbr, liberal_input = True):
    '''convert ascbr(R) to unibr(⠗)'''
    if len(ascbr)<>1:
        return ''.join([ascbr_2_unibr(e) for e in ascbr])
    try:
        out = unichr(0x2800 + _BrailleAscii__ascii.index(ascbr))
    except:
        if on_conv_err == ON_CONV_ERR_RAISE:
            raise ValueError,'ascbr_2_unibr failed to convert %s' % ascbr
        elif on_conv_err == ON_CONV_ERR_PASS:
            out = ascbr
        else:
            out = on_conv_err_replace_char
    return out

def dotbr_2_ascbr(dotbr, cell_delimiter='p', valid_chars='012345678,p() '):
    return unibr_2_ascbr(dotbr_2_unibr(dotbr, cell_delimiter='p', valid_chars='012345678,p() '))

def dotbr_2_psebr(dotbr, cell_delimiter='p', valid_chars='012345678,p() '):
    return unibr_2_psebr(dotbr_2_unibr(dotbr, cell_delimiter='p', valid_chars='012345678,p() '))

def dotbr_2_unibr(dotbr, cell_delimiter='p', valid_chars='012345678,p() '):
    '''convert dotbr(eg p1235) to unibr (⠗)

    If dotbr contains a single cell then you can ignore the cell_delimiter
    If dotbr contains two or more cells you MUST specify the cell_delimiter:
        cell delimiter will be 'p' for dotbr like 'p123p1'
        cell delimiter will be ',' for dotbr like '123,1'
        cell delimiter will be either '(' or ')' for dotbr like
        '(123)(1)' or '(1,2,3)(1)'
    Every character of dotbr must be within valid_chars

    Note that the code for this function is quite liberal in what it accepts
    E.g. passing "3120" will return the same output as passing "p123"
    '''
    dotbr=dotbr.strip()
    if dotbr=='': return ''
    if cell_delimiter in dotbr:
        if len(dotbr.split(cell_delimiter))>2:
            return ''.join([dotbr_2_unibr(e) for e in dotbr.split(cell_delimiter) if e<>''])
    for c in dotbr:
        if not c in valid_chars:
            if on_conv_err == ON_CONV_ERR_RAISE:
                raise ValueError,'dotbr_2_unibr found invalid char %s' % c
            elif on_conv_err == ON_CONV_ERR_PASS:
                return dotbr
            else:
                return on_conv_err_replace_char
    o = 0
    if ('1' in dotbr): o += 0x01
    if ('2' in dotbr): o += 0x02
    if ('3' in dotbr): o += 0x04
    if ('4' in dotbr): o += 0x08
    if ('5' in dotbr): o += 0x10
    if ('6' in dotbr): o += 0x20
    if ('7' in dotbr): o += 0x40
    if ('8' in dotbr): o += 0x80
    return unichr( 0x2800 + o )

def strip_dots78(unibr):
    '''strips dots 7,8 from all cells of unibr'''
    if len(unibr)<>1:
        return ''.join([strip_dots78(e) for e in unibr])
    return unichr(ord(unibr) & 0x283f)

def is_six_dot(unibr):
    '''returns True if there are no dots 7/8 in input'''
    if len(unibr)<>1:
        return min([is_six_dot(e) for e in unibr])
    return ((ord(unibr) & 0xC0) == 0)

if __name__ == "__main__":
    test_braille_ascii()

    for o in range(0,256):
        unibr = unichr(0x2800 + o)
        #print o, hex(0x2800 + o), unichr(0x28ff)+ unibr, unibr_2_dotbr(unibr), dotbr_2_unibr(unibr_2_dotbr(unibr))
        print u"%c%s %s %s" % (0x28ff,unibr, unibr_2_dotbr(unibr, '(',',',')'), is_six_dot(unibr))
        print unibr_2_psebr(unibr)
        print
        assert dotbr_2_unibr(unibr_2_dotbr(unibr)) == unibr

    for o in range(0,256):
        if o % 8 == 0: print
        unibr = unichr(0x2800 + o)
        print '%9s=%s' % (unibr_2_dotbr(unibr), unibr),
    print

    ascbr_table = '''
ABCDEFGHIJ
KLMNOPQRST
UVXYZ&=(!)
*<%?:$]\[W
1234567890
/+#>'-
@^_".;,
'''
    for line in [i for i in ascbr_table.split('\n') if i.strip()<>'']:
        for ascbr in line:
            print '%s%s ' % (ascbr, ascbr_2_unibr(ascbr)),
        print

    for line in [i for i in ascbr_table.split('\n') if i.strip()<>'']:
        print unibr_2_psebr(ascbr_2_unibr(line))
        print
    pass

    ascbr = 'ABCDEFGHIJ'
    unibr = ascbr_2_unibr(ascbr)
    dotbr = unibr_2_dotbr(unibr)
    print ascbr
    print unibr
    print dotbr
    print dotbr_2_unibr(dotbr)
    assert dotbr_2_unibr(dotbr) == unibr

    pb8 = unibr_2_psebr(unibr)
    pb6 = unibr_2_psebr(unibr)
    pb6.print_8dot_brl = False
    print 'printing 8dots pseudo braille'
    print pb8
    print 'printing 6dots pseudo braille'
    print pb6

    print u'â ⠬â ', '6dot braille' if is_six_dot(u'â ⠬â ') else '8dot braille'
    print u'â ⠬⢹', '6dot braille' if is_six_dot(u'â ⠬⢹') else '8dot braille'
    print strip_dots78(u'â ⠬⢹'), '6dot braille' if is_six_dot(strip_dots78(u'â ⠬⢹')) else '8dot braille'
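
The dot/bit relationship from the library docstring (dot n corresponds to bit n-1 of the offset from U+2800) can be condensed into a short sketch. This is Python 3, and dots_to_cell, cell_to_dots and strip_dots78_safe are hypothetical names for this illustration only. The last function shows one possible way around the KNOWN BUG: strip_dots78 in the library masks every character, so with -s the letters "f" (0x66) and "o" (0x6f) become "&" (0x26) and "/" (0x2f). Restricting the mask to the Braille Patterns block would avoid that, but this is an untested suggestion, not the author's fix.

```python
BRAILLE_BASE = 0x2800   # first code point of the Unicode Braille Patterns block


def dots_to_cell(dots):
    """Hypothetical helper: "135" -> the cell with dots 1, 3 and 5 raised."""
    bits = 0
    for d in dots:
        if d in "12345678":
            bits |= 1 << (int(d) - 1)   # dot n sets bit n-1
    return chr(BRAILLE_BASE + bits)


def cell_to_dots(cell, prefix="p"):
    """Hypothetical helper: one Unicode Braille cell -> e.g. "p135"."""
    bits = ord(cell) - BRAILLE_BASE
    if not 0 <= bits <= 0xFF:
        raise ValueError("not a Unicode Braille cell: %r" % cell)
    digits = "".join(str(n + 1) for n in range(8) if bits & (1 << n))
    return prefix + (digits or "0")     # the blank cell is written as dot 0


def strip_dots78_safe(text):
    """Clear dots 7 and 8, leaving non-braille characters untouched."""
    out = []
    for ch in text:
        cp = ord(ch)
        if BRAILLE_BASE <= cp <= BRAILLE_BASE + 0xFF:
            ch = chr(BRAILLE_BASE + ((cp - BRAILLE_BASE) & 0x3F))
        out.append(ch)
    return "".join(out)


print(cell_to_dots(dots_to_cell("135")))    # p135
print(hex(ord(dots_to_cell("135"))))        # 0x2815
```

The second print matches the docstring's worked example: 0x2800 + 1 + 4 + 0x10.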