{"id":4404,"date":"2022-03-02T08:13:24","date_gmt":"2022-03-02T16:13:24","guid":{"rendered":"https:\/\/coderpad.io\/?p=4404"},"modified":"2023-06-05T14:35:47","modified_gmt":"2023-06-05T21:35:47","slug":"developer-diaries-how-we-built-coderpad-monaco-ide","status":"publish","type":"post","link":"https:\/\/coderpad.io\/blog\/development\/developer-diaries-how-we-built-coderpad-monaco-ide\/","title":{"rendered":"Developer Diaries: How We Built a Better Browser-Based IDE with Monaco"},"content":{"rendered":"<p>Integrated development environments (IDEs) are probably the most important tool we use as software developers and engineers. Using another tool, like a notepad or word processor, would be like trying to hammer a nail with your fist.<\/p>\n\n\n<p>That\u2019s why our team just spent a good chunk of time working to <a href=\"https:\/\/coderpad.io\/blog\/product-updates\/introducing-coderpad-monaco\/\" target=\"_blank\" rel=\"noreferrer noopener\">improve the code editor<\/a> in the pads that software engineering candidates use to complete technical assessments \u2013 but it was anything but an easy feat.&nbsp;<\/p>\n\n\n<p>To create the best in-browser IDE experience, we leveraged much of the work from Microsoft\u2019s immensely popular Visual Studio Code (VSCode) IDE, but with a CoderPad twist\u2026<\/p>\n\n\n<p>VSCode runs inside of an environment called <a href=\"https:\/\/www.electronjs.org\/\" target=\"_blank\" rel=\"noopener\">Electron<\/a>, which allows a web environment to bind to native APIs, like <code>filesystem<\/code>. 
Therefore their core code editor, called Monaco, can run on the web.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/d2h1bfu6zrdxog.cloudfront.net\/wp-content\/uploads\/2022\/03\/img_621e5c1dd11a6.png\" alt=\"\"\/><figcaption>VSCode uses Monaco as its editor, which we now do too!<\/figcaption><\/figure>\n\n\n<p>However, Monaco has a few limitations that we had to overcome when building an in-browser IDE. These limitations are in place because VSCode has access to native APIs, while Monaco on the web does not.<\/p>\n\n<p>Let\u2019s look at a few of these limitations, and what we did to solve them.<\/p>\n\n\n<h1 class=\"wp-block-heading\">Parsing, Lexing, and ASTs &#8211; Oh My!<\/h1>\n\n<p>Before we talk about the limitations, we need to quickly understand how your code is processed in order to generate compiled instructions.<\/p>\n\n<p>First, it\u2019s important to realize that when you type text into the computer it\u2019s just that: plain text. It\u2019s stored the same way that a <code>.txt<\/code> file is stored. Many compilers will even allow you to compile code directly from a <code>.txt<\/code> file.<\/p>\n\n<p>Because of this, we need a way for the compiler (even for runtime languages without an ahead-of-time compiler like JavaScript and Python) to have a better understanding of the context it handles.<\/p>\n\n<p>We do this by creating an Abstract Syntax Tree (AST). This AST allows your computer to read your code like an instruction book. Each section of this instruction book contains all the information a computer needs to transform your original source code into machine-readable code such as assembly. 
Your editor is also able to use this AST to provide various functions that enhance the editing experience.<\/p>\n\n<p>The AST is constructed by two tools: a lexer and a parser.<\/p>\n\n<h2 class=\"wp-block-heading\">Lexer<\/h2>\n\n<p>A lexer takes plain text and converts it into a list of \u201ctokens.\u201d Each token belongs to a small allowlist of types that the computer can understand more easily than raw text.<\/p>\n\n<p>For example, the lexer might turn:<\/p>\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-1\" data-shcb-language-name=\"JavaScript\" data-shcb-language-slug=\"javascript\"><span><code class=\"hljs language-javascript shcb-wrap-lines\"><span class=\"hljs-keyword\">const<\/span> magicNumber = <span class=\"hljs-number\">185<\/span>;<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-1\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">JavaScript<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">javascript<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n<p>into a list of tokens like:<\/p>\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-2\" data-shcb-language-name=\"JavaScript\" data-shcb-language-slug=\"javascript\"><span><code class=\"hljs language-javascript shcb-wrap-lines\">&#91;CONST_KEYWORD, IDENTIFIER, ASSIGN, INTEGER, SEMICOL]<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-2\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">JavaScript<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">javascript<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<figure class=\"wp-block-image is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/d2h1bfu6zrdxog.cloudfront.net\/wp-content\/uploads\/2022\/03\/img_621e5c2034b7b.png\" alt=\"\" 
width=\"434\" height=\"387\"\/><figcaption>Each section of the string that comprises the code is converted to smaller sections called tokens. For example, \u201cconst\u201d would be a \u201cCONST_KEYWORD\u201d token.<\/figcaption><\/figure>\n\n\n<p>These tokens are defined by a \u201cgrammar\u201d that declares which characters should and should not be matched to a token in a language\u2019s source code.<\/p>\n\n<p>It\u2019s important to note that a tokenizer does not yet actually understand what your code is trying to do. This means that if we added an extraneous equals sign into our code, it would not yet throw an error, but instead tokenize just fine.<\/p>\n\n\n<figure class=\"wp-block-image is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/d2h1bfu6zrdxog.cloudfront.net\/wp-content\/uploads\/2022\/03\/img_621e5c20a4eba.png\" alt=\"\" width=\"410\" height=\"366\"\/><figcaption>When a duplicate token is added, say with two assignment equals signs, it won\u2019t throw an error at this stage. Instead, the lexer will tokenize them both without a problem.<\/figcaption><\/figure>\n\n\n<p>While the lexer is important, it only tells half the story behind a language&#8217;s syntax because of its inability to actually understand the language. This is where the parser comes into play.<\/p>\n\n<h2 class=\"wp-block-heading\">Parser<\/h2>\n\n<p>Once the lexer gathers a list of tokens, it\u2019s time for the parser to assemble an Abstract Syntax Tree (AST) from those tokens.<\/p>\n\n<p>The AST takes the tokens and adds in further context to fully understand what your code is attempting to say. 
E.g., our <code>magicNumber<\/code> example might be turned into a structure similar to the following:<\/p>\n\n\n<figure class=\"wp-block-image is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/d2h1bfu6zrdxog.cloudfront.net\/wp-content\/uploads\/2022\/03\/img_621e5c211ccf1.png\" alt=\"\" width=\"379\" height=\"580\"\/><figcaption>An AST is a tree-like structure that\u2019s defined in a pattern explained more below.<\/figcaption><\/figure>\n\n\n<p>This shows us that we\u2019re trying to:&nbsp;<\/p>\n\n\n<ol class=\"wp-block-list\"><li><code>const<\/code> assign a <code>VariableDeclaration<\/code>&nbsp;<\/li><li>with an <code>Identifier<\/code> of <code>magicNumber<\/code>, and&nbsp;<\/li><li>an <code>init<\/code>ial value of <code>185<\/code>, a <code>NumericLiteral<\/code><\/li><\/ol>\n\n\n<p>This parser is the step where syntax validation occurs. If we attempted to add back the second equals sign, we might get the following error:<\/p>\n\n<blockquote class=\"wp-block-quote\"><p>Uncaught SyntaxError: Unexpected token &#8216;=&#8217;<\/p><\/blockquote>\n\n<p>This error is telling us that while the lexer was able to identify the symbol, the parser did not expect said token where it was present.<\/p>\n\n<blockquote class=\"wp-block-quote\"><p>Want to learn more about syntax parsing and how your computer takes source code and converts it into a running program? 
<a href=\"https:\/\/unicorn-utterances.com\/posts\/how-computers-speak#lexer\" target=\"_blank\" rel=\"noopener\">I wrote an article to help explain this more in depth!<\/a><\/p><\/blockquote>\n\n<h1 class=\"wp-block-heading\">Syntax Highlighting<\/h1>\n\n<p>With tokens in hand, we should now be able to easily add syntax highlighting to our editor by running a tokenizer over our code and assigning each type of token a different color, right?<\/p>\n\n<p>That\u2019s true, but it points to some problems with Monaco\u2019s default tokenizer.<\/p>\n\n\n<p>See, <a href=\"https:\/\/code.visualstudio.com\/api\/language-extensions\/syntax-highlight-guide#tokenization\" target=\"_blank\" rel=\"noopener\">VSCode itself utilizes grammar files called \u201ctmLanguage\u201d files to tokenize your code<\/a>. This tokenizer is borrowed from <a href=\"https:\/\/macromates.com\/\" target=\"_blank\" rel=\"noopener\">TextMate<\/a>.&nbsp;<\/p>\n\n\n\n<p>However, these grammars rely on a C regular expression library called <a href=\"https:\/\/github.com\/kkos\/oniguruma\" target=\"_blank\" rel=\"noopener\">Oniguruma<\/a> to execute the patterns they contain. Because this library does not run in a pure JavaScript environment \u2013 which was a requirement for Monaco historically due to browser support \u2013 Monaco ships out-of-the-box with a different grammar language.<\/p>\n\n\n\n<p>This alternative is called <a href=\"https:\/\/microsoft.github.io\/monaco-editor\/monarch.html\" target=\"_blank\" rel=\"noopener\">Monarch<\/a> and, unfortunately, it does not produce highlighting as high-quality as TextMate\u2019s. 
Take a look for yourself:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"399\" src=\"https:\/\/d2h1bfu6zrdxog.cloudfront.net\/wp-content\/uploads\/2022\/03\/monarch-vs-textmate-1024x399.png\" alt=\"\" class=\"wp-image-4533\" srcset=\"https:\/\/coderpad.io\/wp-content\/uploads\/2022\/03\/monarch-vs-textmate-1024x399.png 1024w, https:\/\/coderpad.io\/wp-content\/uploads\/2022\/03\/monarch-vs-textmate-300x117.png 300w, https:\/\/coderpad.io\/wp-content\/uploads\/2022\/03\/monarch-vs-textmate-768x299.png 768w, https:\/\/coderpad.io\/wp-content\/uploads\/2022\/03\/monarch-vs-textmate.png 1457w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Monarch is displaying a majority of the code as plain text as opposed to TextMate\u2019s rendering, which shows much more information. This impacts screen-readers as well!<\/figcaption><\/figure>\n\n\n<p>Notice how most of the semantic value is lost in Monarch. TextMate, on the other hand, is able to showcase that <code>console<\/code> is an object with a method of <code>log<\/code>.<\/p>\n\n<p>Luckily for us, newer browsers ship with WebAssembly supported out-of-the-box, which means that we can compile Oniguruma from C to WASM and ship that with our IDE.<\/p>\n\n\n<p>More specifically, we\u2019re using `<a href=\"https:\/\/github.com\/microsoft\/vscode-oniguruma\" target=\"_blank\" rel=\"noopener\">vscode-oniguruma<\/a>`, which is made by Microsoft.<br>But this introduced yet another problem: integrating TextMate into Monaco typically requires hacks of some kind, and introduces some flaws into the experience.<\/p>\n\n\n\n<p>To work around this, we actually <a href=\"https:\/\/www.npmjs.com\/package\/@codingame\/monaco-editor\" target=\"_blank\" rel=\"noopener\">forked the <code>monaco-editor<\/code> package<\/a> to make it easier to integrate with `<a href=\"https:\/\/github.com\/microsoft\/vscode-textmate\" target=\"_blank\" 
rel=\"noopener\">vscode-textmate<\/a>`, the very package VSCode uses to interpret TextMate grammars.<\/p>\n\n\n<h1 class=\"wp-block-heading\">Language Services<\/h1>\n\n<p>Now that we have basic syntax highlighting explained, let\u2019s walk through how we were able to add error reporting.<\/p>\n\n\n<p><a href=\"https:\/\/code.visualstudio.com\/api\/language-extensions\/language-server-extension-guide\" target=\"_blank\" rel=\"noopener\">VSCode utilizes something called the Language Server Protocol<\/a> (LSP), which lets the editor talk to a separate language server. When a user makes changes to a file, that file is then persisted to the filesystem. Then, whenever the user has an interaction that requires more metadata than the IDE currently knows about a file\u2019s syntax, it makes a request to this server in order to get said metadata.<\/p>\n\n\n<p>This works because the language server is smarter than the AST the editor has on hand, thanks to additional processing done on the parsed code.<\/p>\n\n<p>For example, say the user wants to see the TypeScript type of a variable. When they hover over the variable, the editor sends a request to the LSP server, including the information needed to fulfill it, such as the file and the cursor position.<\/p>\n\n<p>Every time the user makes an edit to a file, the LSP server handles parsing the relevant files and uses its programmed logic to generate metadata.&nbsp;<\/p>\n\n<p>Then, when a user makes a request for this metadata, say, with a hover, the server returns the requested TypeScript type information over an RPC connection so the editor can show it to the user.<\/p>\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/d2h1bfu6zrdxog.cloudfront.net\/wp-content\/uploads\/2022\/03\/img_621e5c221dd59.png\" alt=\"\"\/><figcaption>The language server parses the file every time it\u2019s edited. 
Then, once data is requested, it\u2019s able to read from that parsed information to pass to the client.<\/figcaption><\/figure>\n\n\n<p>This service is also used to report errors, warnings, and other information that an IDE might usually surface to a developer. It also enhances autocomplete by providing the required metadata to make intelligent suggestions.<\/p>\n\n<p>We\u2019re also using LSP, although we\u2019ve had to do things slightly differently from VSCode. VSCode has the advantage of running on a user\u2019s computer.&nbsp;<\/p>\n\n\n<p>Locally, language servers can run as native programs. However, since we\u2019re doing this in the browser, we don\u2019t have the ability to do this for all of <a href=\"https:\/\/coderpad.io\/languages\/\">our 30+ supported programming languages<\/a> without our bundle size exploding.<\/p>\n\n\n<p>To solve this problem, we set up LSP cloud servers to handle requests as if the user were running the editor on their computer.<\/p>\n\n<h1 class=\"wp-block-heading\">Extended Syntax Support<\/h1>\n\n<p>These language services aren\u2019t just helpful for error displays; they also allow us to enhance the syntax highlighting provided using \u201csemantic highlighting.\u201d<\/p>\n\n\n<p>For example, this is what syntax highlighting looks like <em>without <\/em>semantic highlighting:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/d2h1bfu6zrdxog.cloudfront.net\/wp-content\/uploads\/2022\/03\/img_621e5c22a4668.png\" alt=\"\"\/><figcaption>Without semantic highlighting, classes and methods are often not displayed properly.<\/figcaption><\/figure>\n\n\n\n<p>While this is what it looks like <em>with <\/em>semantic highlighting:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/d2h1bfu6zrdxog.cloudfront.net\/wp-content\/uploads\/2022\/03\/img_621e5c22eff81.png\" alt=\"\"\/><figcaption>Classes are now displayed with the proper 
highlighting based on usage.<\/figcaption><\/figure>\n\n\n<p>Notice how the parameters that are passed in are highlighted in the same color. Likewise, <code>Range<\/code> and <code>Position<\/code> are both acknowledged as classes, while <code>getFoldingRanges<\/code> is presented as a function, even when used as a parameter.<\/p>\n\n<p>This is all possible because the language service provides more metadata than is available from the AST. It\u2019s able to intelligently parse out this context in order to make the editing experience that much better.<\/p>\n\n\n<p>This feature is <a href=\"https:\/\/code.visualstudio.com\/api\/language-extensions\/semantic-highlight-guide\" target=\"_blank\" rel=\"noopener\">even utilized in VSCode<\/a>, and <a href=\"https:\/\/microsoft.github.io\/language-server-protocol\/specifications\/specification-current\/#textDocument_semanticTokens\" target=\"_blank\" rel=\"noopener\">well documented in the LSP specification<\/a>. Because we already utilize LSP, as mentioned earlier, we get this out-of-the-box with minimal additional lift.<\/p>\n\n\n<h1 class=\"wp-block-heading\">Additional Features<\/h1>\n\n<p>While we touched on a lot of features that developers expect, it\u2019s far from a complete list.<br><br>We also wanted to enable experiences like:<\/p>\n\n\n<ul class=\"wp-block-list\"><li>Different themes (currently light and dark modes, with more to come!)<\/li><li>Vim &amp; Emacs modes<\/li><li>Keybindings and user configurations (coming soon)<\/li><li>Language-specific behaviors (such as COBOL magic tabs)<\/li><\/ul>\n\n\n\n<p>To make all of these experiences possible, <a href=\"https:\/\/www.npmjs.com\/package\/@codingame\/monaco-editor-wrapper\" target=\"_blank\" rel=\"noopener\">we\u2019ve added a wrapper package around Monaco, which we creatively call <code>monaco-editor-wrapper<\/code><\/a>.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" 
src=\"https:\/\/d2h1bfu6zrdxog.cloudfront.net\/wp-content\/uploads\/2022\/03\/img_621e5c236e063.png\" alt=\"\"\/><figcaption>Our Monaco editor wrapper contains the Monaco editor alongside language extensions and additional features.<\/figcaption><\/figure>\n\n\n<h1 class=\"wp-block-heading\">Conclusion<\/h1>\n\n\n<p>Hopefully, this has been an interesting insight into the nuts and bolts of our new Monaco editor. We hope you enjoyed reading about how we crafted the new experience as much as we did building it&#8211;we loved the challenge! If so, <a href=\"https:\/\/coderpad.io\/careers\/\">check out our careers page<\/a> for open engineering roles, and come work with us on fun problems like this.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Our team is rolling out our new CoderPad Monaco editor. It was a fun challenge to implement and I&#8217;d love to share how we did it!<\/p>\n","protected":false},"author":1,"featured_media":4506,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[9,13],"tags":[],"persona":[27,29],"blog-programming-language":[],"keyword-cluster":[],"class_list":["post-4404","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-development","category-product-updates"],"acf":[],"_links":{"self":[{"href":"https:\/\/coderpad.io\/wp-json\/wp\/v2\/posts\/4404","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/coderpad.io\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/coderpad.io\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/coderpad.io\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/coderpad.io\/wp-json\/wp\/v2\/comments?post=4404"}],"version-history":[{"count":25,"href":"https:\/\/coderpad.io\/wp-json\/wp\/v2\/posts\/4404\/revisions"}],"predecessor-version":[{"id":8071,"href":"https:\/\/coderpad.io\/wp-json\/wp\/v2\/pos
ts\/4404\/revisions\/8071"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/coderpad.io\/wp-json\/wp\/v2\/media\/4506"}],"wp:attachment":[{"href":"https:\/\/coderpad.io\/wp-json\/wp\/v2\/media?parent=4404"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/coderpad.io\/wp-json\/wp\/v2\/categories?post=4404"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/coderpad.io\/wp-json\/wp\/v2\/tags?post=4404"},{"taxonomy":"persona","embeddable":true,"href":"https:\/\/coderpad.io\/wp-json\/wp\/v2\/persona?post=4404"},{"taxonomy":"blog-programming-language","embeddable":true,"href":"https:\/\/coderpad.io\/wp-json\/wp\/v2\/blog-programming-language?post=4404"},{"taxonomy":"keyword-cluster","embeddable":true,"href":"https:\/\/coderpad.io\/wp-json\/wp\/v2\/keyword-cluster?post=4404"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}