Code fragment context building functionality #12566

@josevalim

Hi @scohen and @scottming! (cc @lukaszsamson)

I would like to discuss some of the context-building functionality that you mentioned. This document contains specific questions that I would like your input on.

When we are working on a file, we need to provide completion on variable names, imports and so on. For example, take this buffer:

defmodule Foo do
  import Baz

  def bat do
    var = 123
    {<CURSOR>
  end

  def local_function do
    ...
  end
end

Here <CURSOR> marks the position of the editor cursor. In order to provide completion, we need three things:

  1. A parser that is robust to syntax failures

  2. Build a local context with imports, variables, aliases, local functions, etc

  3. Understand which kind of completion we should provide (alias? function call? variable name?).

We only provide step 3 at the moment via cursor_context. My goal for now is to focus on step 2 (and then we can discuss step 1). To build a local context, we have 3 main sources of data: lexical, variables, and module data. Let's discuss them one by one.
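For reference, step 3 as it exists today can be sketched as follows. The module name `CompletionKindSketch` and the simplified return atoms are hypothetical and only for illustration; the tuples matched on are the ones `Code.Fragment.cursor_context/1` returns for these kinds of fragments.

```elixir
defmodule CompletionKindSketch do
  # Hypothetical helper: maps the result of cursor_context/1 to a coarse
  # "kind of completion" atom. Only a few result shapes are handled here.
  def kind(fragment) do
    case Code.Fragment.cursor_context(fragment) do
      {:alias, _name} -> :alias
      {:local_or_var, _name} -> :local_or_var
      {:module_attribute, _name} -> :module_attribute
      {:dot, _inside, _name} -> :remote_call
      _other -> :other
    end
  end
end
```

Note that this only tells us what kind of thing is being typed. It says nothing about which variables, imports, or aliases are in scope, which is what step 2 is about.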

Imports, aliases, and requires

This is the easiest information to gather. It can be done by (1) traversing the AST and (2) defensively expanding all AST nodes, collecting all imports, aliases, and requires along the way.

Building the AST can be done with container_cursor_to_quoted. At the moment we don't have a safe way to expand its nodes, but this could be provided.
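As a rough illustration of the traversal half, here is a minimal sketch that builds the AST with `Code.Fragment.container_cursor_to_quoted/1` and collects directives purely syntactically. It does not expand anything, so it misses directives produced by macros, ignores scoping, and skips forms such as `alias Foo.{A, B}`. The module name `LexicalSketch` and its `collect/1` function are hypothetical.

```elixir
defmodule LexicalSketch do
  @directives [:import, :alias, :require]

  # Returns {:ok, [{directive, module}]} in source order,
  # or the error tuple from the parser.
  def collect(source) do
    with {:ok, ast} <- Code.Fragment.container_cursor_to_quoted(source) do
      {_ast, acc} = Macro.prewalk(ast, [], &visit/2)
      {:ok, Enum.reverse(acc)}
    end
  end

  # Matches `import Foo`, `alias Foo.Bar`, `require Foo` (with or without options).
  defp visit({directive, _meta, [{:__aliases__, _, segments} | _]} = node, acc)
       when directive in @directives do
    # Segments such as __MODULE__ are not plain atoms; skip those in this sketch.
    if Enum.all?(segments, &is_atom/1) do
      {node, [{directive, Module.concat(segments)} | acc]}
    else
      {node, acc}
    end
  end

  defp visit(node, acc), do: {node, acc}
end
```

For the buffer at the top of this issue, passing the text up to the cursor should yield `{:ok, [import: Baz]}`.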

Variables

Variables are trickier because they have more complex scoping rules than imports, aliases, and requires. There are two ways we could attempt to collect variables:

  1. Simply traverse the AST and collect all {atom, meta, atom} nodes.

  2. Try to use the same approach as for imports, aliases, and requires, but that would be tricky and I suspect it would also be brittle.

QUESTION: Which approach do you prefer? 1 definitely sounds simpler.
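To make approach 1 concrete, here is a minimal sketch of it. The module name `VariableSketch` is hypothetical, and the traversal deliberately ignores all scoping rules.

```elixir
defmodule VariableSketch do
  # Returns {:ok, names} with every {atom, meta, atom} node found before the
  # cursor, in order of first appearance, or the error tuple from the parser.
  def collect(source) do
    with {:ok, ast} <- Code.Fragment.container_cursor_to_quoted(source) do
      {_ast, names} =
        Macro.prewalk(ast, [], fn
          {name, _meta, context} = node, acc when is_atom(name) and is_atom(context) ->
            {node, [name | acc]}

          node, acc ->
            {node, acc}
        end)

      {:ok, names |> Enum.reverse() |> Enum.uniq()}
    end
  end
end
```

One caveat with this shape-based match: a definition written without parentheses, such as `def bat do`, parses `bat` as `{:bat, meta, nil}`, so it is collected as if it were a variable. Any real implementation would need to filter such cases.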

QUESTION: The biggest issue with collecting variables in both cases is deciding when a function starts. If we have this code:

some_var = 123

def bat do
  {<CURSOR>
end

some_var is not immediately visible inside bat. Such cases are easy to detect for def, defp and friends, but remember that anyone can define their own "def"-like macro, as Nx does with defn. So how should we handle this? One option is to ignore the problem and show some_var anyway, for two reasons: variables defined in the module body are not common, so the rate of false positives is low, and they can still be accessed inside a function through unquote.
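The unquote case looks like this. The module name `UnquoteSketch` is only for illustration.

```elixir
defmodule UnquoteSketch do
  # Defined in the module body, outside any function.
  some_var = 123

  # The function body cannot see some_var directly, but unquote injects
  # its value at compile time.
  def bat, do: unquote(some_var)
end
```

Calling `UnquoteSketch.bat()` returns `123`, so suggesting `some_var` inside `bat` is not always a false positive.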

Module information

Something else we may want to complete is local function names and module attributes. Local function names are the trickier of the two, so let's focus on those.

All information so far (imports, aliases, variables) has to be defined before the cursor. Local functions are trickier because, even if local_function is defined after the cursor, we may still want to suggest it. How do we gather this information?

  1. One option is to only suggest local functions defined in a previously compiled version of this module. So if we have defmodule Foo do previously compiled and it had def local_function in it, we know it exists and we can suggest it.

  2. Another option is to provide a heuristic. Similar to variables, we can look at the AST and collect all nodes of the shape {atom, meta, list}. Each such node is a call to something that was either imported, which we can check against the imports information, or defined locally. This has two additional complexities:

    • Given we want to traverse the whole AST, it is not enough to have a parser that builds the AST only up to the cursor. We need to always attempt to build the AST for the whole file (see Add string_to_quoted in Code.Fragment #11967)

    • Because of macros such as |>, we cannot read the arity directly from {atom, meta, list} nodes: the macro adds one additional argument. So we need a heuristic to expand certain AST nodes in order to fetch the real arity (a simple heuristic is to expand all operators)
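The heuristic in option 2 could be sketched as below. This is only an illustration under several assumptions: it uses `Code.string_to_quoted/1`, so it requires a file that parses (the robust whole-file parser is a separate problem), it special-cases `|>` by hand instead of expanding operators in general, and the module name `LocalCallSketch` is hypothetical.

```elixir
defmodule LocalCallSketch do
  # Returns {:ok, MapSet of {name, arity}} for every {atom, meta, list} node,
  # or the error tuple from the parser.
  def collect(source) do
    with {:ok, ast} <- Code.string_to_quoted(source) do
      {_ast, calls} = Macro.prewalk(ast, MapSet.new(), &visit/2)
      {:ok, calls}
    end
  end

  # `left |> call(args)`: the real arity is length(args) + 1. Returning the
  # rewritten call means prewalk descends into its arguments without
  # visiting the call itself again.
  defp visit({:|>, _meta, [left, {name, call_meta, args}]}, acc)
       when is_atom(name) and is_list(args) do
    {{name, call_meta, [left | args]}, MapSet.put(acc, {name, length(args) + 1})}
  end

  defp visit({name, _meta, args} = node, acc) when is_atom(name) and is_list(args) do
    {node, MapSet.put(acc, {name, length(args)})}
  end

  defp visit(node, acc), do: {node, acc}
end
```

The result is noisy on purpose: it also contains entries such as `{:def, 2}` or `{:+, 2}`, which come from Kernel and would be filtered out by checking against the imports information, along with special forms such as `=`.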

QUESTION: I believe the best solution in this case is actually a mixture of both 1 and 2. Use the heuristic to build as much information as possible and then refine the heuristic with information from a compiled version of the module (for example, the compiled version can have docs, types, metadata, etc). Do you agree? Anything to add?
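For the compiled half of that mixture, a minimal sketch could look like this. It only covers public functions of a module that is already compiled and loadable; the module name `CompiledInfoSketch` is hypothetical, and merging its output with the heuristic results is left out.

```elixir
defmodule CompiledInfoSketch do
  # Returns the public {name, arity} pairs of a compiled module,
  # or [] when the module is not available.
  def public_functions(module) do
    if Code.ensure_loaded?(module) and function_exported?(module, :__info__, 1) do
      module.__info__(:functions)
    else
      []
    end
  end
end
```

Docs for the same module could be looked up with `Code.fetch_docs/1`. Private functions do not appear in `__info__(:functions)`, which is one more reason to keep the AST heuristic around.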

Any additional thoughts?
