I'm quite a new to blogging but I know a lot about the software development. Are these two activities that different? They look alike to me:
- You write down your ideas in a language of your choice.
- You need to adhere to some rules like grammar.
- When a post is done it is pushed to a public site.
Hey, that's exactly what software developers do all the time!
Let's have a look at how this idea can be put in use.
One of the most important decision is about the syntax used to write posts. Ideally the syntax should be very light so that the writer focuses on the content. Additionally it should have some capabilities to organize the document.
I've taken into consideration the following options:
- LaTeX + LaTeX2HTML
I know that my posts will need to be converted to HTML at some point in time so why not just start with it. This language has a great tooling with WYSWIG editors. Moreover, it is the most powerful option. With pure HTML I should be able to write anything I like.
The only problem is that it sometimes can be hard to read with many different tags obscuring the picture.
This is particularly visible when it comes to embedded source code.
The problem is even worse because in HTML there are two characters that demand special treatment:
Left angle brackets are used to start tags whereas ampersands are used to denote HTML entities.
In order to use them as literal characters, it is necessary to escape them as entities, e.g.
Even if they are inserted by an editor they will exist in the source, thus making it harder to read.
LaTeX + Tex4HT
An alternative solution, which should be attractive to all academics, is the LaTeX. With some tools like HTLaTeX it is possible to compile documents to HTML.
I studied at a university where using LaTeX it is not mandatory but most of the instructors preferred this format. Therefore, I learned how to write LaTeX articles before I grasped the HTML. I still had the right environment which was a portable distribution of MikTeX with some of my favourite packages. In order to evaluate this option I created a sample post which contained some source code listings and images.
During that process I found few resources which were particularly helpful:
- UK List of TeX Frequently Asked Questions
- TeX Users Group
- StackExchange Tex
Although it worked, it wasn't easy. The biggest problem I had was with the source code formatting. The listings package I used to produced beautiful listings in a PDF but when I run the 'tex4ht' they all looked much worse. Fonts had no longer constant width and nothing was aligned as it should. It turned out that listings is quite advanced and it has it's own algorithm to organize everything into columns. That's how this works with any any font you use but this solution didn't worked for HTML.
I fixed it by changing the base font to be typewriter ('/ttfamily'). Everything was aligned again but I lost the bold style of the keywords. It made me think about using using colors instead. After all, this document won't be printed!
I found a very interesting answer at the StackExchange.
This is the key idea:
Imho a more simple approach is with fonts: if every style is connected to a different font then tex4ht surrounds the chars with classes which you can set through css.
It worked like charm. I still had to compile the document few times and inspect the html to capture all the class names that I need to use in the css but it fast. Eventually I came up with a nicely colored source code listing.
In summary, the experience wasn't bad but I had a feeling that I had to search for solutions and workarounds too often. I decided to look for something different.
The third option that I took into consideration is Markdown. Initially I've been using it at StackExchange without knowing its name. I would just write a question or an answer and discover the syntax accidentally by typing and looking at the 'Preview' section. It was possible because of the philosophy it was created with.
Markdown is intended to be as easy-to-read and easy-to-write as is feasible.
Readability, however, is emphasized above all else. A Markdown-formatted document should be publishable as-is, as plain text, without looking like it's been marked up with tags or formatting instructions.
-- John Gruber
One of the biggest advantage of this syntax, which was probably also admired by the creators of StackOverflow, is the code block. You can just paste your code inline, indent it by 4 spaces or 1 tab and it will be properly formatted. No tags, no commands, just indentation - perfect!
Most of the code I write is in C#. Because classes are defined inside a namespace block they are indented by default so there is no extra action needed. The code can be copied as it is and it will be converted into HTML.
This was one of the reasons why I selected Markdown as the language for my blog.
I believe that the version control system plays very important role in every software project. Whenever I start something new and I know that it stick around for longer than a day I create a repository. Blog definitely falls into this category. The decision about which VC system to use was quite easy. For all the private work I use Git. I've got all the tools installed and account on a GitHub to backup my repository.
With a public repository there is theoretical a chance that somebody will send me a pull request to fix something in the post but I don't think this will happen. Not because the code I write is flawless but blog is a personal thing. Nevertheless, contributions are more than welcome.
In summary, I will write all my posts using Markdow syntax. They will under version control system and available in two different places. Sources will be saved in the GitHub and the HTML version will be published in the Blogger.