A Princeton professor, finding a little time for himself in the summer academic lull, emailed an old friend a couple of months ago. Brian Kernighan said hello, asked how their US visit was going, and dropped off hundreds of lines of code that could add Unicode support to AWK, the text-parsing tool he helped create for Unix at Bell Labs in 1977.
“I’ve tested this a fair amount but clearly more tests are needed,” Kernighan wrote in the email, posted in late May as a kind of pseudo-commit on the onetrueawk repo by longtime maintainer Arnold Robbins. “Once I figure out how … I’ll try to submit a pull request. I wish I understood git better, but in spite of your help, I still haven’t got a proper understanding, so this might take a while.”
Kernighan is the “K” in AWK, a special-purpose language for extracting and manipulating text that was key to Unix’s pipeline features and interoperability between systems. A working awk function (AWK is the language, awk the command to invoke it) is crucial to both Standard UNIX Specification and IEEE POSIX certification for interoperability. There are many variants of awk, including modern derivations with support for Unicode, but “One True AWK,” sometimes known as nawk, is a kind of canonical version based on Kernighan’s 1985 book The AWK Programming Language and his subsequent input.
Kernighan is also the “K” in “K&R C,” the foundational 1978 book The C Programming Language he cowrote with Dennis Ritchie that sticks with programmers, mentally and in dog-eared paper form. C’s roots go a lot deeper. Kernighan had been teaching C to workers at Bell Labs and convinced its creator, Ritchie, to collaborate on a book to spread the knowledge. That book gave birth to “the one true brace style,” the endless debate that goes with it, and the structure underpinning every modern programming language.
Kernighan also named Unix and first demonstrated the “Hello, world” code example. He spoke with Ars Technica’s Richard Jensen for a 50th anniversary history of Unix.
The onetrueawk repository, where Kernighan appeared in late May, is a relatively quiet place, with 21 contributors, 46 GitHub users watching, and commits coming every few months. As noted by The Register, Kernighan’s Unicode fix came to light mostly because it was mentioned in an interview with the professor by YouTube channel Computerphile.
“It’s always been an embarrassment that AWK only worked with ASCII, or maybe 8-bit inputs, but it doesn’t really handle Unicode at all,” Kernighan tells interviewer professor David Brailsford. “A few months ago, I spent some time working with (laughs) an incredibly old program. I have it at this point where it will actually handle UTF-8 input and output so that you can have regular expressions that, you know, pick up Japanese characters, things like that.”
Kernighan, now 80, offhandedly mentions in the interview that he has also patched something “quick and dirty” to let AWK handle CSV files.