Initial Go support #153

Status: Open. 4 commits to merge into base branch main-dev.

MarkReedZ

Added a few functions for Go. If we merge this, I'll finish it up.

$ go run scripts/bench.go
Contains
   38.798µs 	strings.Contains
   37.872µs 	sz.Contains
Index
   26.29µs 	strings.Index
   37.712µs 	sz.Index
IndexAny
   588.567µs 	strings.IndexAny
   45.911µs 	sz.IndexAny

@ashvardanian
Owner

Wow, I am surprised the latency is comparable. How long is the input?

I believe for Go, it would be great to transpile our C code.

@MarkReedZ
Author

MarkReedZ commented May 17, 2024

I'll look at transpiling. I don't expect a significant improvement on these benchmarks; from what I've read, it's likely to slow things down considerably.

The strings above are 1 MB. Below are the results for 10 KB strings. Short strings show the same performance, except for IndexAny, where the Go time eventually matches StringZilla.

$ go run bench.go
Contains
   607ns 	strings.Contains
   804ns 	sz.Contains
Index
   324ns 	strings.Index
   492ns 	sz.Index
IndexAny
   5.917µs 	strings.IndexAny
   733ns 	sz.IndexAny

@ashvardanian
Owner

How about super short inputs in the range of 10-100 bytes?

@MarkReedZ
Author

None of the transpilers work on the StringZilla code, and I don't see much information on them.

The build flags are below, by the way. Adding -O3 slows us down.

// #cgo CFLAGS: -g -mavx2
// #include <stdlib.h>
// #include <../../include/stringzilla/stringzilla.h>
import "C"

100-byte strings

Contains
   245ns 	strings.Contains
   425ns 	sz.Contains
Index
   139ns 	strings.Index
   137ns 	sz.Index
IndexAny
   181ns 	strings.IndexAny
   275ns 	sz.IndexAny

16-byte strings

Contains
   138ns 	strings.Contains
   452ns 	sz.Contains
Index
   56ns 	strings.Index
   198ns 	sz.Index
IndexAny
   205ns 	strings.IndexAny
   262ns 	sz.IndexAny

@MarkReedZ
Author

Others benchmarking cgo state it's 40ns per C call, but I don't see it being that slow. I'll get more of the funcs in, as I'm using this in some of my Go projects. We can also try manually writing a func or two in Go to see how that goes.

@MarkReedZ
Author

@ashvardanian
We do have an issue with repeated C calls. sz.Count is faster until there are too many matches, since each call to sz_find costs ~50ns. We can either:

A) Write Count in C so it's one function call to C instead of N calls
B) Write sz_find in Go

Let me know and I'll go ahead. The numbers below are for a 9-byte substring; strings.Count is faster if the substring to search for is 4 bytes or shorter.

Count 1  
   55.06µs 	strings.Count
   40.692µs 	sz.Count
Count 100,000 
   1.344558ms 	strings.Count
   6.815231ms 	sz.Count

@ashvardanian
Owner

How about we implement the count in the cgo header?
