[Question] Support constraining via context free grammar for dsls #304

Open
DanielProkhorov opened this issue Jan 1, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@DanielProkhorov

Hi @lbeurerkellner,

Do you have any plans to "natively" integrate token constraints into the LMQL language, perhaps through ANTLR/Lark/EBNF grammar notation? This feature is currently supported by guidance (https://github.com/guidance-ai/guidance?tab=readme-ov-file#context-free-grammars) and is shown in examples from other projects such as outlines (https://github.com/outlines-dev/outlines?tab=readme-ov-file#using-context-free-grammars-to-guide-generation).

@lbeurerkellner added the enhancement (New feature or request) label Jan 1, 2024
@lbeurerkellner
Collaborator

We have some plans, but there is currently no concrete ETA. I will keep the issue open to track the status.

Do you have any concrete use cases in mind?

@DanielProkhorov
Author

Do you have any concrete use cases in mind?

Yes, I do. I was recently playing around with a DSL for HMI testing in the automotive domain. Using guidance, my script looks like the following:

import torch
import guidance
# guidance loads the Hugging Face model itself, so no direct transformers import is needed
from guidance import models, system, user, assistant, gen, select, one_or_more

checkpoint = "HuggingFaceH4/zephyr-7b-beta"
lm = models.TransformersChat(checkpoint, device_map="auto", torch_dtype=torch.bfloat16)

@guidance(stateless=True)
def enter_teststep(lm):
    return lm + "Enter " + gen(stop="into", max_tokens=20) + " into the " + gen(max_tokens=3) + "."

@guidance(stateless=True)
def tap_teststep(lm):
    return lm + "Tap the " + gen(stop="button", max_tokens=3) + " button"

@guidance(stateless=True)
def modification_teststep(lm):
    return lm + select(["Activate", "Deactivate"]) + " the " + gen(stop="option", max_tokens=3) + " option"

@guidance(stateless=True)
def slider_modification_teststep(lm):
    return lm + select(["Increase", "Decrease"]) + " the " + select(["vertical", "horizontal"]) + gen(stop="slider", max_tokens=3) + " slider by " + one_or_more(select(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'])) + gen(max_tokens=2)

@guidance(stateless=True)
def selection_teststep(lm):
    return lm + "Select " + gen(stop="from", max_tokens=5) + " from the list"

@guidance(stateless=True)
def teststep(lm):
    return lm + select([enter_teststep(), tap_teststep(), modification_teststep(), slider_modification_teststep(), selection_teststep()])

system_msg = "You are an expert in software testing within the automotive infotainment domain. Additionally, your understanding of the ANTLR grammar is enormous."

prompt = """Antlr Grammar for Test Case Description DSL in the Infotainment Domain:

grammar TestCase;

testcase: 'Testcase:' ID
  'Preconditions:' teststeps*
  'Actions:' teststeps+
  'Postconditions:' teststeps*;

teststeps: enterStep
  | tapStep
  | modificationStep
  | sliderModificationStep
  | selectStep;

enterStep: 'Enter the' OBJECTNAME 'into the' TARGETNAME;

tapStep: 'Tap the' OBJECTNAME 'button';

modificationStep: (Activate | Deactivate) 'the' OBJECTNAME 'option';

sliderModificationStep: (Increase | Decrease) 'the' ORIENTATION OBJECTNAME 'slider by' NUMBER UNITS;

selectStep: 'Select' OBJECTNAME 'from the list';

OBJECTNAME: ID;
TARGETNAME: ID;
ORIENTATION: 'vertical' | 'horizontal';
UNITS: ID;
NUMBER: DIGIT+;
Activate: 'Activate';
Deactivate: 'Deactivate';
Increase: 'Increase';
Decrease: 'Decrease';

ID: [a-zA-Z]+;
DIGIT: [0-9];

User Test Case:

Testcase: Check bass slider from -10 to 10

Preconditions:
Tap the settings button
Tap the tone settings button

Predict the next logical test step using the grammar rules.

The next logical test step is the following action:
- 
"""

with system():
    llm = lm + system_msg

with user():
    llm += prompt

with assistant():
    llm += teststep()

resulting in the following response:

[Screenshot of the model's response]
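For anyone who wants to check such a response without the screenshot, here is a minimal sketch that validates a single generated test step against the step rules from the prompt grammar above, using only Python's `re` module. The names `STEP_PATTERNS` and `classify_step` are hypothetical helpers, not part of guidance or LMQL, and the sketch assumes tokens are separated by exactly one space, which the grammar itself does not specify.

```python
import re

# Regex equivalents of the lexer rules ID and NUMBER from the prompt grammar.
ID = r"[a-zA-Z]+"
NUMBER = r"[0-9]+"

# One pattern per parser rule. Assumption: tokens are separated by a single
# space (the grammar in the prompt defines no whitespace handling).
STEP_PATTERNS = {
    "enterStep": rf"Enter the {ID} into the {ID}",
    "tapStep": rf"Tap the {ID} button",
    "modificationStep": rf"(?:Activate|Deactivate) the {ID} option",
    "sliderModificationStep": (
        rf"(?:Increase|Decrease) the (?:vertical|horizontal) {ID} "
        rf"slider by {NUMBER} {ID}"
    ),
    "selectStep": rf"Select {ID} from the list",
}


def classify_step(text):
    """Return the name of the first rule matching the whole step, else None."""
    candidate = text.strip()
    for name, pattern in STEP_PATTERNS.items():
        if re.fullmatch(pattern, candidate):
            return name
    return None


if __name__ == "__main__":
    for step in [
        "Tap the settings button",
        "Increase the vertical bass slider by 10 units",
        "Tap the tone settings button",
    ]:
        print(f"{step!r}: {classify_step(step)}")
```

Under this reading of the grammar, `OBJECTNAME` is a single `ID`, so a multi-word name such as "tone settings" is rejected even though it appears in the example preconditions. A check like this only filters output after generation; it does not constrain decoding the way a native grammar integration would.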
