Thanks!
Another method (via vim):
http://gist.github.com/347644
The script saves the highlighted code to an HTML file in the current directory.
A couple of notes.
1) In the regexp, if you are not going to use a capturing group, you can mark it as a non-capturing group using '?:', i.e. /(?:this_group_wont_match)/, so you can use $2 and $1 instead of $3 and $2. In your case it would look like this: /\<code(?: lang="(.+?)")?\>(.+?)\<\/code\>/m.
2) Changing the code tag into a div may not be a good idea if you're planning to have both inline and block code. The code tag is by itself an inline element. For block code it may be better to wrap code in pre.
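To illustrate the first note, here is a small sketch comparing the two patterns. The sample strings and the groups helper are made up for the example; only the regexps come from the discussion.

```ruby
# Same pattern twice: the optional lang wrapper is capturing in the first
# and non-capturing in the second.
CAPTURING     = /\<code( lang="(.+?)")?\>(.+?)\<\/code\>/m
NON_CAPTURING = /\<code(?: lang="(.+?)")?\>(.+?)\<\/code\>/m

# Hypothetical helper: returns the numbered groups ($1, $2, ...) as an array,
# or nil when the pattern does not match at all.
def groups(pattern, text)
  match = pattern.match(text)
  match && match.captures
end

sample = %(<code lang="ruby">puts 1</code>)

# Capturing wrapper: three groups, language is $2 and code is $3.
p groups(CAPTURING, sample)

# Non-capturing wrapper: two groups, language is $1 and code is $2.
p groups(NON_CAPTURING, sample)

# Without a lang attribute the language group is simply nil.
p groups(NON_CAPTURING, "<code>x = 1</code>")
```

Note that the group still takes part in matching; '?:' only stops it from being numbered.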
Does it also work with FCKEditor or another WYSIWYG editor?
@Matt, using Rack middleware is an interesting solution for this. I haven't looked into it but does performance become a problem? I imagine it needs to look through every HTML page even if there is no <pre> block on it, and is it possible to cache the results?
@hakunin, thanks for the notes and very good points. I always forget the syntax for non-capturing parentheses.
@grigio, I'm not very familiar with these editors, but you just need to determine what they generate for the code syntax and swap that out like I do with the <code> tag in this episode.
@Memiux, I'm experimenting with other video codecs because, unfortunately, the Animation codec which I normally use is very buggy in Snow Leopard. I hope to find a solution which is nearly as good quality as Animation.
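On the Rack middleware question above, here is a rough sketch of how the cost could be kept down. It is not Matt's implementation: the HighlightMiddleware class, the highlighter block, and the in-memory hash cache are all assumptions for illustration. It skips non-HTML responses, uses a cheap substring check for pages without a <pre> block, and caches processed bodies by digest.

```ruby
require "digest"

# Hypothetical Rack-style middleware. The block passed to new stands in for
# whatever highlighting call a real app would make.
class HighlightMiddleware
  def initialize(app, &highlighter)
    @app = app
    @highlighter = highlighter
    @cache = {} # in-memory only; a real app would likely use a shared cache store
  end

  def call(env)
    status, headers, body = @app.call(env)
    return [status, headers, body] unless html?(headers)

    parts = []
    body.each { |part| parts << part }
    body.close if body.respond_to?(:close)
    html = parts.join

    # Cheap substring check: pages without <pre> skip the expensive work.
    return [status, headers, [html]] unless html.include?("<pre")

    key = Digest::SHA1.hexdigest(html)
    result = (@cache[key] ||= @highlighter.call(html))

    # The body size changed, so drop any stale Content-Length header.
    headers = headers.reject { |name, _| name.downcase == "content-length" }
    [status, headers, [result]]
  end

  private

  # Header names are compared case-insensitively.
  def html?(headers)
    pair = headers.find { |name, _| name.downcase == "content-type" }
    !pair.nil? && pair[1].to_s.include?("text/html")
  end
end
```

Every HTML page is still read once, but the highlighter itself only runs for bodies that contain <pre> and have not been seen before.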
Ryan, thanks for another good tutorial.
It seems like a good step toward highlighting search results, am I right?
I think the method looks similar.
I was wondering if there were any reasons you chose the ruby-based libraries over javascript libraries, other than the rails-specific target of the screen-cast series?
Hey Ryan,
Thanks for the informative screencast. Looking at the benchmarks, I was wondering how much of Pygments' slow performance comes from shelling out and how much is due to Pygments itself.
I'm not very familiar with this benchmarking stuff though (the next Railscast?:-). On my machine coderay is about the same, 0.24 total and pygmentize is 9.28.
Not to discredit Pygments, I wrote up the following snippet, which may not be that useful for Rails developers. It returns 0.496517 usec to run Pygments 50 times from Python. Not sure exactly how to get more consistent benchmarks between the two.
http://gist.github.com/349504
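One rough way to separate the two costs from the Ruby side is to time how long it takes just to start and stop an external process, and compare that with the total time for shelling out to the highlighter. This is only a sketch: it uses the current Ruby interpreter as a stand-in for the external command, so the numbers will not match Python's startup time, and the method names are made up for the example.

```ruby
require "benchmark"
require "rbconfig"

# Total wall-clock seconds for n runs of the given block.
def total_seconds(n)
  Benchmark.realtime { n.times { yield } }
end

# Cost of only spawning a process that does nothing. RbConfig.ruby is the
# path of the running interpreter, used here as a stand-in command.
def spawn_overhead(n)
  total_seconds(n) do
    system(RbConfig.ruby, "-e", "", out: File::NULL, err: File::NULL)
  end
end

# The same number of trivial in-process calls, for comparison.
def in_process_baseline(n)
  total_seconds(n) { "puts 1".upcase }
end

runs = 5
puts "spawn overhead for #{runs} runs: #{spawn_overhead(runs)} s"
puts "in-process baseline for #{runs} runs: #{in_process_baseline(runs)} s"
```

Subtracting the spawn overhead from the total time of the shelled-out runs gives a rough idea of how much is left for the library itself.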
Is there an update on Pygments yet? I would also like to know this.
Nice screencast, as usual :-) One recommendation, though: instead of re-parsing the code snippet on every request and caching it in the view, it is better to do the parsing in a before_save hook in the model and store the final, pre-rendered HTML in another text column in your table. The view then only has to display this column instead of the original. That way it won't matter which library is faster, because you run it once per save and not on every request.
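A minimal plain-Ruby sketch of that idea follows. It is not a real model: in Rails this would be an ActiveRecord class with a before_save callback, and the content_html column, the render_count counter, and the escaping renderer are all assumptions standing in for the real thing.

```ruby
# Plain-Ruby stand-in for a model with a pre-rendered HTML column.
class Article
  attr_accessor :content
  attr_reader :content_html, :render_count

  def initialize(content)
    @content = content
    @content_html = nil
    @render_count = 0 # only here to show how often rendering happens
  end

  # Mimics the save lifecycle: run the hook, then persist (omitted here).
  def save
    render_content
    true
  end

  private

  # Hypothetical renderer: escapes the source and wraps it in <pre>.
  # A real hook would call the highlighting library here instead.
  def render_content
    escaped = @content.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
    @content_html = "<pre>#{escaped}</pre>"
    @render_count += 1
  end
end

article = Article.new("a < b")
article.save
3.times { article.content_html } # reading the column does not re-render
puts article.content_html
puts article.render_count
```

The view would then output the stored column, so the rendering cost is paid when the record changes rather than when it is displayed.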
Great video (as they all are).
I ended up using Ultraviolet since I love the TextMate themes anyway (I'm actually an Emacs+Mac guy myself). My blog hasn't moved over from WordPress to my own custom software yet, but once it has, it's going to use Ultraviolet.
Just so others know, Heroku has the required libraries for Ultraviolet already installed, which makes using it a snap :)
http://mxkelsin.heroku.com/2010/04/01/installing-ultraviolet-on-my-mac/
Thanks for covering CodeRay - you did an outstanding job of explaining its strengths, weaknesses, and how to use it.
Only one addition: there's a built-in option to combine the two. Requiring "coderay/for_redcloth" enables the textile-style @[lang]...@ and bc[lang] shortcuts for syntax highlighting.
I have not been able to make it work with Rails 3.
I tried using your suggestions from the XSS screencast with combinations of html_safe and raw, but so far no luck. I can get it to output it escaped once or twice, but never unescaped!
Has anyone made it work for Rails 3?
I got it to work with Rails 3.
    def coderay(text)
      text.gsub!(/\<code(?: lang="(.+?)")?\>(.+?)\<\/code\>/m) do
        code = CodeRay.scan($2, $1).div(:css => :class)
        "<notextile>#{code}</notextile>"
      end
      return text.html_safe
    end
And to override the scaffold pre style, added background-color: black
    .CodeRay pre {
      margin: 0px;
      padding: 0px;
      background-color: black;
    }
Hi,
just to let you know that I've built a new syntax highlighting engine in Ruby. It's called Prism and you'll find it here: http://github.com/kib2/Prism.
I've also made a little pastebin with it: http://prism-pastebin.heroku.com/.
The sample you gave renders the following: http://prism-pastebin.heroku.com/13
Please report any bugs or problems to me. See you.
To get the correct output with Rails 3.0.0.rc, replace the following line:
content_tag("notextile", CodeRay.scan($3, $2).div(:css => :class))
with:
content_tag('notextile', CodeRay.scan($3, $2).div(:css => :class).html_safe)
(simply add html_safe)
And in the view:
<%=raw textilize(coderay(@article.content)) %>
(simply add raw)
To 53:@Batzooh
Thanks for your comment, I can finally get it running on Rails 3.0.0.rc.
But I also noticed that you're using the raw helper method, which I think is dangerous because any script in the article content will run. It's better to use the sanitize helper method.
<%=sanitize coderay(@article.content) %>
full fix =)
CodeRay has built-in support for RedCloth: http://stackoverflow.com/a/8850113/132257
CodeRay documentation sucks so much. Where can I find which $number maps to which language? What about other options for the div? How do programmers even find this stuff out? Digging through the mess of the source code?
What the heck is textile, and why is it impossible to find any source that actually explains it, instead of just off-handedly mentioning it as if everyone knows what it is and when you use it? Why do we need redcloth and why do we need textile? Assumptions.
Do they even want people to use it? Why would you not make the documentation better? Why make it so complicated and obscure to find out how to use it?