defmodule Guaxinim.Internal.SourceProcessor do
alias Makeup.Lexers.ElixirLexer
alias Guaxinim.Utils.Tokens
alias Guaxinim.Internal.SourceParser
defp line_indentation(line) do
Strip the indentation and hash sign (`#`) from the comment line:
[_, _, _, comment] = Regex.run(~r/(\s*)#(\s?)(.*)/, line)
comment
end
  defp strip_indent(line, 0), do: line

  defp strip_indent(line, indent) do
    case line do
      <<_::binary-size(indent), rest::binary>> ->
        rest

      _ ->
        line
    end
  end
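As a quick illustration of the clauses above, the sketch below wraps the same binary pattern in a public function so it can be run in isolation. The module name `StripIndentSketch` is hypothetical and not part of Guaxinim. Note that the pattern drops the first `indent` bytes without checking that they are whitespace, and leaves lines shorter than `indent` bytes untouched.

```elixir
defmodule StripIndentSketch do
  # Hypothetical stand-alone copy of the private strip_indent/2 clauses.
  def strip_indent(line, 0), do: line

  def strip_indent(line, indent) do
    case line do
      # Drop exactly `indent` bytes from the front of the line.
      <<_::binary-size(indent), rest::binary>> -> rest
      # The line is shorter than `indent` bytes: keep it unchanged.
      _ -> line
    end
  end
end

StripIndentSketch.strip_indent("    # a comment\n", 4)
# returns "# a comment\n"

StripIndentSketch.strip_indent("\n", 4)
# returns "\n" (an empty line inside an indented block is preserved)
```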
defp merge_blocks_and_token_lines(blocks, token_lines) do
We’ll transform the blocks, consuming lines as we go.
Code blocks: these blocks work with tokens and not raw lines of text.
They have no use for indentation, because they’ll be rendered inside |
{:code, lines}, {result, tok_lines} ->
Consume the appropriate number of lines:
The blocks are added to the front of the list. They must be reversed afterwards!
Heredocs: Guaxinim recognizes 3 kinds of heredocs: `@moduledoc`, `@doc` and `@typedoc`.
{tag, lines} = block, {result, tok_lines} when tag in [:moduledoc, :doc, :typedoc] ->
{_, rest_token_lines} = Enum.split(tok_lines, length(lines) + 2)
indent = indentation(lines)
{[{tag, to_text(indent, block), indent} | result], rest_token_lines}
Comments: comments are similar to the heredocs above. They will be treated as Markdown and rendered into HTML. They will be indented too.
{:comment, lines} = block, {result, tok_lines} ->
{_, rest_token_lines} = Enum.split(tok_lines, length(lines))
indent = indentation(lines)
{[{:comment, to_text(indent, block), indent} | result], rest_token_lines}
end)
Extract raw text from comment block:
Extract raw text from heredocs (`@doc`, `@moduledoc` and `@typedoc`):
  defp to_text(indent, {tag, lines}) when tag in [:doc, :moduledoc, :typedoc] do
    lines |> Enum.map(&strip_indent(&1, indent)) |> Enum.join("")
  end
Extract raw text from code block (implemented for completeness, it’s not needed right now):
Process the raw source of the file into blocks.
  def process_source(file, source) do
    tokens = ElixirLexer.lex(source, [])
    token_lines = Tokens.split_into_lines(tokens)
    %{error: nil, result: blocks} = SourceParser.from_string(source)
    merged_blocks = merge_blocks_and_token_lines(blocks, token_lines)

    anchor_padding =
      token_lines
      |> length()
      |> Integer.to_string()
      |> String.length()

    %{file: file, blocks: merged_blocks, anchor_padding: anchor_padding}
  end
end
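The line-consuming reduction described for `merge_blocks_and_token_lines` can be sketched in a simplified, self-contained form. This is only an illustration under stated assumptions: the module `ConsumeLinesSketch` is hypothetical, token lines are plain strings rather than Makeup tokens, heredoc and comment blocks keep their raw lines instead of extracted text, and the handling of `:code` blocks is a guess at what "consume the appropriate number of lines" means. The two extra lines skipped for heredocs are assumed to be the opening and closing delimiter lines.

```elixir
defmodule ConsumeLinesSketch do
  # Hypothetical, simplified reduction: the accumulator pairs the blocks built
  # so far with the token lines that have not been consumed yet.
  def merge(blocks, token_lines) do
    {merged, _remaining} =
      Enum.reduce(blocks, {[], token_lines}, fn
        {:code, lines}, {result, tok_lines} ->
          # Assumption: a code block takes one token line per source line.
          {taken, rest} = Enum.split(tok_lines, length(lines))
          {[{:code, taken} | result], rest}

        {tag, lines}, {result, tok_lines} when tag in [:moduledoc, :doc, :typedoc] ->
          # Skip the body plus two extra lines (assumed: the delimiter lines).
          {_, rest} = Enum.split(tok_lines, length(lines) + 2)
          {[{tag, lines} | result], rest}

        {:comment, lines}, {result, tok_lines} ->
          # Comment token lines are discarded; the raw text is kept instead.
          {_, rest} = Enum.split(tok_lines, length(lines))
          {[{:comment, lines} | result], rest}
      end)

    # Blocks were prepended to the accumulator, so restore source order.
    Enum.reverse(merged)
  end
end

blocks = [{:comment, ["# hi\n"]}, {:code, ["x = 1\n", "y = 2\n"]}]
ConsumeLinesSketch.merge(blocks, ["# hi", "x = 1", "y = 2"])
# returns [{:comment, ["# hi\n"]}, {:code, ["x = 1", "y = 2"]}]
```

Prepending to the accumulator and reversing once at the end keeps the reduction linear in the number of blocks, whereas appending with `++` on every step would copy the growing list each time.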
Evaluate the indentation level of the line. This will only be used for comment blocks or for heredocs; we will never need to know the indentation level of a code block.
After having used ExSpirit to parse the file, it might seem a little clunky to use the Regex module. The reason is that handling indentation in ExSpirit, although completely doable, is somewhat complex and would require us to define much stricter rules for what counts as a comment block. Parsing indentation in a post-processing step is the right choice here.
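A minimal sketch of what such a Regex-based post-processing step might look like follows. Both functions and the module name `IndentationSketch` are hypothetical: the actual body of `line_indentation` is not shown in this listing, and taking the smallest indentation among the non-blank lines of a block is an assumption about how a block-level indent could be derived.

```elixir
defmodule IndentationSketch do
  # Hypothetical helper: number of leading whitespace characters in a line.
  def line_indentation(line) do
    # `^(\s*)` always matches, possibly with an empty capture.
    [_, leading] = Regex.run(~r/^(\s*)/, line)
    String.length(leading)
  end

  # Hypothetical helper: smallest indentation among the non-blank lines.
  def indentation(lines) do
    lines
    |> Enum.reject(&(String.trim(&1) == ""))
    |> Enum.map(&line_indentation/1)
    # Fall back to 0 when the block has no non-blank lines.
    |> Enum.min(fn -> 0 end)
  end
end

IndentationSketch.indentation(["    # first\n", "    #   nested\n", "\n"])
# returns 4
```

Blank lines are ignored when computing the block indentation so that an empty line inside an indented comment or heredoc does not pull the result down to zero.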