Release of TeaScript 0.13.0

TeaScript 0.13.0 was published on 4 March 2024 and can be downloaded for free.

UPDATE: In the meantime, a newer release with new features has been published. Check out the download page!

All details, features and changes of this release follow in the blog post below.

Info and Links

The download page with more information, links and basic instructions:
☛☛☛ Download Page. ☚☚☚

Browse the source code of the TeaScript C++ Library on Github.

Are you new to TeaScript?
Then you may read here first: Overview and Highlights of TeaScript.

What is new?

Here is a summary of the new main features for version 0.13.

This release comes with a new integrated Buffer type (contiguous memory that is accessible from C++ as a raw std::vector<unsigned char>!) along with a large set of utility and buffer support functions.

Read below how easy it is to write (and read) arbitrary binary files (with a Bitmap file as an example) in TeaScript with the help of the new Buffer.

Find out who won the benchmark for filling a 32 bit RGBA buffer pixel by pixel: ChaiScript VS TeaScript.

Also, this version comes with a nice and handy way to process UTF-8 encoded strings and their Unicode glyphs.

Furthermore, 0.13 adds 2 new integral types, U64 and U8, suffixes for all Number types/literals, integral hex literals with 0x, and bit operators (bit_and, bit_or, bit_xor, bit_not, bit_lsh and bit_rsh).

Additionally, a cast operator is now available: as.

Last but not least, TeaScript 0.13 is published under a new license.

Example code

For those of you who prefer to discover and learn new features by studying real code, and for those who want to try everything themselves, here is some example code.

Hint: There is a syntax highlighting config for Notepad++ for easier reading of TeaScript code.

To execute the tea files, use either the teascript_demo application from Github or the free, pre-compiled TeaScript Host Application for Windows and Linux (source code is included) from the download section / the links at the top.

New License

The TeaScript C++ Library is now licensed under the MPL-2.0 (Mozilla Public License 2.0).

The Mozilla Public License can be read here https://www.mozilla.org/en-US/MPL/2.0/

The new license has the following advantages for TeaScript and for its users:

  • It is explicitly compatible with AGPL, GPL and LGPL.
  • It is compatible with Apache, MIT, BSD, Boost licenses (and others).
  • Larger works (that is, applications using TeaScript) can be distributed closed source (or under a compatible license) as long as the conditions are fulfilled.
  • It can be linked statically (if all conditions are fulfilled).
    (That is why LGPL was not chosen: TeaScript is not available as a DLL.)
  • For the upcoming module system it means that a new first- or third-party module for TeaScript may use any compatible license like MPL-2.0, (A)GPL, MIT, Apache and so on.

Many questions regarding the MPL are also answered in the official MPL 2.0 FAQ.

If you have further questions regarding the new license, don’t hesitate to contact me.

Easy UTF-8 String processing

(TeaScript language feature)

Starting with this version TeaScript comes with a nice and handy way for dealing with and processing arbitrary UTF-8 encoded Strings (in TeaScript all Strings are always UTF-8 encoded!).

fixed _strat() – the foundation

The foundation for this is the changed CoreLibrary function _strat() (TeaScript) / CoreLibrary::StrAt() (its C++ representation).

This function returns the requested position of a given String as a String, e.g.: def s := _strat( "abc", 1 ) // s is "b"

Previously it delivered only the requested single byte, which was problematic: a Unicode glyph encoded as UTF-8 consists of 1, 2, 3 or 4 bytes, depending on the glyph. So the returned String was not a validly encoded UTF-8 String whenever the requested byte belonged to a glyph which uses more than one byte. Basically, this function was only useful for ASCII Strings (ASCII is the subrange of UTF-8 in which every character is a single byte with a value from 0 to 127).

The behavior has changed now.
_strat() now always delivers a full Unicode glyph as a UTF-8 encoded String (or an empty String if out of range).
Even if the requested position points into the middle of a Unicode glyph encoded with more than 1 byte, the function returns the corresponding complete Unicode glyph!

As a result, the size of the returned String is now in the range [0,4], depending on the size of the encoded Unicode glyph. Thanks to the SSO (Small String Optimization) of all major C++ Standard Libraries, no heap allocation is done for this String object (and, depending on the usage, also none at the TeaScript C++ level).

This means that these changes don’t introduce any performance penalty from a memory allocation point of view. The search for the start / end of the glyph that may be needed is negligible, because at most 4 bytes have to be inspected.
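
To illustrate the idea, here is a minimal C++ sketch of how a complete glyph can be located from an arbitrary byte position. This is not the actual implementation of CoreLibrary::StrAt(). The names IsContinuation, GlyphLength and GlyphAt are made up for this example, and the input is assumed to be valid UTF-8 already.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// True for UTF-8 continuation bytes (bit pattern 10xxxxxx).
inline bool IsContinuation( unsigned char c )
{
    return (c & 0xC0) == 0x80;
}

// Byte count (1..4) of the glyph starting with lead byte c.
// Assumption: c is a valid UTF-8 lead byte.
inline std::size_t GlyphLength( unsigned char c )
{
    if( c < 0x80 )           return 1; // 0xxxxxxx (ASCII)
    if( (c & 0xE0) == 0xC0 ) return 2; // 110xxxxx
    if( (c & 0xF0) == 0xE0 ) return 3; // 1110xxxx
    return 4;                          // 11110xxx
}

// Hypothetical helper: returns the complete glyph which covers byte
// position pos, or an empty string if pos is out of range.
// Assumption: str is valid UTF-8.
inline std::string GlyphAt( std::string_view str, std::size_t pos )
{
    if( pos >= str.size() ) {
        return {};
    }
    // walk back to the lead byte (at most 3 steps for valid UTF-8)
    while( pos > 0 && IsContinuation( static_cast<unsigned char>( str[pos] ) ) ) {
        --pos;
    }
    std::size_t const len = GlyphLength( static_cast<unsigned char>( str[pos] ) );
    return std::string( str.substr( pos, len ) );
}
```

With this sketch, GlyphAt( "abc", 1 ) yields "b", and asking for byte 1 of the 3-byte Euro sign yields the whole Euro sign. The result never exceeds 4 bytes, so it fits into the small string buffer of the common standard library implementations.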

The utf8_iterator

TeaScript now offers 3 utility functions for iterating over a UTF-8 encoded String: utf8_begin( str ), utf8_end( it ) and utf8_next( it ).

utf8_begin( str ) returns a (Named) Tuple acting as an iterator. The cur element of the Tuple (e.g., it.cur) always represents the current full Unicode glyph.

utf8_end( it ) checks whether the iterator is already at the end of the String.

utf8_next( it ) advances the iterator to the next complete Unicode glyph and sets the cur element accordingly.

The best way to see how it works is to read and try the provided test scripts example_v0.13.tea and corelibrary_test04.tea.
Hint: There is a syntax highlighting config for Notepad++ for easier reading of TeaScript code.
Here is just a quick example:

def str := "₱€₾₿" // some currency symbols (4 glyphs, 12 bytes)

def it := utf8_begin( str )  // creates the iterator
println( it.cur )            // "₱"
it := utf8_next( it )        // advancing
println( it.cur )            // "€"
it := utf8_next( it )        // advancing
println( it.cur )            // "₾"
it := utf8_next( it )        // advancing
println( it.cur )            // "₿"

if( utf8_end( it ) ) {
    println( "End of String." )
}

Miscellaneous UTF-8

strtrim() now supports the full Unicode range for the given set of glyphs to trim away (see corelibrary_test04.tea).

Use _strglyphtobytepos() to get the starting byte of a desired glyph from a String.

_substr() as well as _strreplacepos() will now reject the request if the given range does not form a valid and complete UTF-8 String.

Integral types, hex literals and casting

(TeaScript language feature)

TeaScript now has 2 (new) unsigned integral number types, namely U8 for representing a single 8 bit byte and U64, which is a 64 bit wide unsigned integer. All numbers can now be explicitly set to a specific type by using one of the following suffixes: u8, u64, i64, f64. Also, the integer types can be written as hex literals with the prefix 0x.

Use the new binary cast operator as to cast between all number types and other supported types like Bool and String.

def a  :=  1u8   // 8 bit byte
def b  :=  0u64  // 64 bit unsigned

def c  :=  -1i64 // signed 64 bit
def d  :=   1    // default is signed 64 bit.

def e  :=  0xcafeCAFEu64 // unsigned 64 bit in hex notation

def f  :=   5f64  // 64 bit floating point.

// type checking
a is u8           // true
a is Number       // true
e is u64          // true
u64 is TypeInfo   // true
typeof f          // f64

// casting
def float := b as f64  // float is now 0.0

def byte :=  c as u8   // byte is 255 now

// this will throw an integer overflow:
0xffffffffffffffffu64 as i64  // won't eval! too big for i64

def test  := b as Bool    // false
def test2 := a as Bool    // true
def str   := f as String  // "5.0"
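
For readers coming from C++, the two integer casts from the example above can be approximated like this. It is only a sketch of the observable behavior shown in the example (-1 becomes 255 as u8, a too big u64 is rejected for i64) and not the code used inside TeaScript. The function names and the exception type are assumptions of this sketch.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper: keeps only the lowest 8 bits (modulo 2^8),
// so -1 becomes 255.
inline std::uint8_t ToU8Wrapping( std::int64_t v )
{
    return static_cast<std::uint8_t>( v );
}

// Hypothetical helper: converts u64 to i64 and reports an overflow
// instead of silently producing a negative number.
inline std::int64_t ToI64Checked( std::uint64_t v )
{
    constexpr auto max = static_cast<std::uint64_t>( std::numeric_limits<std::int64_t>::max() );
    if( v > max ) {
        throw std::overflow_error( "value is too big for i64" );
    }
    return static_cast<std::int64_t>( v );
}
```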

Pro Tip: As a recap, in TeaScript all types are values. So, you can store types in variables or (Named) Tuple elements, pass them as function parameters or return them as return values. Link to the documentation: Types are values!

Bit operators

(TeaScript language feature)

TeaScript now has the common bit operators for binary and, or and xor, the unary bit not, and the shift operators for left and right shift of all integral types.

Thanks to the underlying C++20, shifting of signed integers is well defined.

Attempting to shift by a negative or too big number (as the rhs operand) will result in an evaluation error.

// bit operators as usual
def bits := 0x2 bit_or 0x4 bit_or 0x8   // 0xe

const is_set := (bits bit_and 0x2) != 0 // true

bit_not 0     // -1

1u8 bit_lsh 2 // 4

-1 bit_lsh  2 // -4
-32 bit_rsh 2 // -8

// evaluation error: "Bitshift value is too big for operand!"
1u8 bit_lsh 9
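
The check of the shift count described above could look roughly like the following C++ sketch. It is not the TeaScript implementation. The name CheckedShiftLeft, the exception type, the wording of the first message and the exact limit (a count equal to or above the bit width of the operand is rejected) are assumptions of this sketch.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <type_traits>

// Hypothetical helper: validates the shift count before shifting.
template< typename T >
T CheckedShiftLeft( T value, std::int64_t count )
{
    static_assert( std::is_integral_v<T>, "integral types only" );
    using U = std::make_unsigned_t<T>;
    // bit width of the operand type, e.g. 8 for u8 and 64 for i64
    constexpr std::int64_t bits = std::numeric_limits<U>::digits;
    if( count < 0 ) {
        throw std::out_of_range( "Bitshift value is negative!" );
    }
    if( count >= bits ) {
        throw std::out_of_range( "Bitshift value is too big for operand!" );
    }
    // shift the unsigned representation and convert back,
    // which gives the two's complement result also for negative values
    return static_cast<T>( static_cast<U>( static_cast<U>( value ) << count ) );
}
```

Used with std::uint8_t, shifting 1 by 2 gives 4, while shifting by 9 ends in the exception, similar to the last line of the TeaScript example.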


The Buffer type

(TeaScript language and C++ Library feature)

General information

Buffers are represented as contiguous memory and can be accessed and modified bytewise at byte boundaries (1 byte = 8 bit, U8 in TeaScript).
Note: It is not possible to share assign (@=) from a single byte of a buffer.

In TeaScript a buffer has the type Buffer, which is a std::vector<unsigned char> in C++. Thus, in C++ you can directly access the buffer and its memory and do whatever you want and need in the most optimal and performant way.
See test_code9() for a full-blown C++ example.

The Subscript operator (e.g., buf[ idx ]) can be used to access and modify an existing byte.

Buffers will not grow beyond their original capacity automatically (use _buf_resize to grow or shrink), but their size will grow up to the capacity.

As with everything in TeaScript, memory will be freed automatically when the last reference goes out of scope or is undef’ed.

Because TeaScript has only one signed integral type, I64, all setters and getters for signed types operate with I64.
Because TeaScript has only U8 and U64 as unsigned integral types, all getters and setters using a bigger type than U8 operate with U64.

All getters and setters operate in host byte order.

Buffer support functions

A buffer is created with _buf( size ).
This creates an empty Buffer (size == 0) with capacity ‘size’.

Use _buf_size( buffer ) to query the actual size (in bytes) of the buffer and _buf_capacity( buffer ) to query its capacity (amount of allocated memory in bytes).

Use buf_zero( buffer ) to fill the complete capacity of the buffer with zeroes. (After the call the size is equal to the capacity.)
Or use _buf_fill( buf: Buffer, pos: Number, count: Number, val: U8) or _buf_fill32( buf: Buffer, pos: Number, count: Number, val: U64) for a more fine grained way to fill a part of the buffer or the complete buffer with some values.

Use _buf_resize( buffer, size ) to grow or shrink the buffer. This function will affect size and capacity!

Use _buf_copy( dst: Buffer, dst_off: Number, src: Buffer, src_off: Number, len: Number ) to (partly) copy a source buffer into a destination buffer.

Use _buf_at( buffer, pos ) to get the requested byte as U8. This function will throw an out of range error if pos is out of range.

With the following functions it is possible to access the data of the buffer as different types. All common integral sizes from 8 to 64 bit are supported, signed and unsigned.
The String related functions only work with valid UTF-8 Strings. The terminating 0 is not part of the buffer content.
The functions operate in host byte order.
These functions don’t throw on error but return a Bool with value false.

_buf_get_u8(), _buf_get_u16(), _buf_get_u32(), _buf_get_u64() will interpret the requested data as the corresponding unsigned integral type and return it as the best fitting type for TeaScript.
_buf_get_i8(), _buf_get_i16(), _buf_get_i32(), _buf_get_i64() will interpret the requested data as the corresponding signed integral type and return it as the best fitting type for TeaScript.

All of the following setters increase the buffer size, provided there is enough capacity left, in two cases: when the position is equal to the current size (meaning: append), or when the data to set needs more room than is available from pos to size.

_buf_set_u8(), _buf_set_u16(), _buf_set_u32(), _buf_set_u64() will write the value as the corresponding unsigned type (host byte order).
_buf_set_i8(), _buf_set_i16(), _buf_set_i32(), _buf_set_i64() will write the value as the corresponding signed type (host byte order).

_buf_set_string() will write the given String at given position _without_ the trailing 0. (Tip: Write the length of the string first for an easy read back of the string later.)

_buf_get_string() will read a String from the buffer. It must be valid UTF-8.

_buf_get_ascii() will read a String from the buffer. It must be valid ASCII.

Use readfile() and writefile() to read binary data from a file into a buffer and to write a buffer to a file. (These functions are the binary counterparts of the existing readtextfile()/writetextfile(), which handle UTF-8 file content only.)
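
Since a Buffer is a std::vector<unsigned char> on the C++ side, the rules for the setters and getters described above can be sketched in plain C++. The following is an illustration under assumptions and not the code of the TeaScript Core Library: the names SetU32 and GetU32 are made up, and rejecting a position behind the current size is my reading of the rules above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Buffer = std::vector<unsigned char>;

// Hypothetical setter: writes v at byte position pos in host byte order.
// The size grows when appending or when the value reaches behind the
// current size, but never beyond the capacity.
// Returns false on error instead of throwing.
inline bool SetU32( Buffer &buf, std::size_t pos, std::uint32_t v )
{
    if( pos > buf.size() ) {
        return false;                 // would leave a gap (assumption)
    }
    std::size_t const end = pos + sizeof( v );
    if( end > buf.capacity() ) {
        return false;                 // no automatic growth of the capacity
    }
    if( end > buf.size() ) {
        buf.resize( end );            // within capacity, so no reallocation
    }
    std::memcpy( buf.data() + pos, &v, sizeof( v ) );
    return true;
}

// Hypothetical getter: reads a 32 bit value in host byte order.
// Returns false if the requested range is not inside the buffer.
inline bool GetU32( Buffer const &buf, std::size_t pos, std::uint32_t &out )
{
    if( pos > buf.size() || buf.size() - pos < sizeof( out ) ) {
        return false;
    }
    std::memcpy( &out, buf.data() + pos, sizeof( out ) );
    return true;
}
```

std::memcpy is used here on purpose: it copies the bytes exactly as they are laid out in memory (host byte order) and has no alignment requirements, so any byte position works.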

Buffer example code

See write_image.tea for how the buffer support functions are used to write a Bitmap file with TeaScript.

See test_code9() for a C++ example.

Below is a very basic example of how a proprietary file format could be realized with the buffer support functions.

// some basic usage for a buffer ...

// create a buffer with a capacity of 100 bytes and current size of 0
def buf := _buf( 100 )

_buf_size( buf )        // 0
_buf_capacity( buf )    // 100


// here we could fill all data with zeros with buf_zero(buf)
// then the size will be 100 and data could be stored at every valid index.
// but we use the alternative possibility and append data step by step.

// add some data
_buf_set_u8( buf, 0, 123u8 )   // 123 added as u8 (8 bit unsigned)

_buf_size( buf )        // 1
_buf_capacity( buf )    // 100

// now we can use index 1 for append more data
_buf_set_i64( buf, 1, 0x1337cafe ) // 0x1337cafe added as i64 (64 bit signed)

_buf_size( buf )        // 9
_buf_capacity( buf )    // 100

const str := "Hello World!"

// add the string length first as u64
_buf_set_u64( buf, 9, _strlen( str ) as u64 ) 

// and then the string content
_buf_set_string( buf, 17, str )

_buf_size( buf )        // 29


// now we could write our own proprietary file format
writefile( cwd() % "tea_test.bin", buf, true )


// and read it back later into a new buffer.
def newbuf := readfile( cwd() % "tea_test.bin" )

_buf_size( newbuf )    // 29

// read and print the string content
println( _buf_get_string( newbuf, 17, _buf_get_u64( newbuf, 9 ) ) )  // "Hello World!"
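
Because the content is just raw bytes, the same layout can also be read on the C++ side. Here is a minimal sketch which parses the layout of the example above (u8 at position 0, i64 at 1, u64 string length at 9, string content at 17) from a std::vector<unsigned char>. The names Record, ReadAt and ParseRecord are made up for this illustration. The sketch assumes host byte order, as used by the TeaScript functions, and it does not validate that the text is UTF-8.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>
#include <vector>

// Hypothetical representation of the example file content.
struct Record
{
    std::uint8_t tag   = 0;
    std::int64_t value = 0;
    std::string  text;
};

// Reads a value of type T from raw bytes in host byte order.
template< typename T >
bool ReadAt( std::vector<unsigned char> const &buf, std::size_t pos, T &out )
{
    if( pos > buf.size() || buf.size() - pos < sizeof( T ) ) {
        return false;
    }
    std::memcpy( &out, buf.data() + pos, sizeof( T ) );
    return true;
}

// Parses: u8 @ 0, i64 @ 1, u64 length @ 9, string bytes @ 17.
inline std::optional<Record> ParseRecord( std::vector<unsigned char> const &buf )
{
    Record rec;
    std::uint64_t len = 0;
    if( !ReadAt( buf, 0, rec.tag ) || !ReadAt( buf, 1, rec.value ) || !ReadAt( buf, 9, len ) ) {
        return std::nullopt;
    }
    // the length field was read successfully, so buf.size() >= 17 here
    if( len > buf.size() - 17 ) {
        return std::nullopt;          // string content is truncated
    }
    rec.text.assign( reinterpret_cast<char const *>( buf.data() ) + 17,
                     static_cast<std::size_t>( len ) );
    return rec;
}
```

Writing the length in front of the string is what makes the read back this simple: the reader knows how many bytes belong to the text without searching for a terminator.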


Benchmark ChaiScript VS TeaScript

There is a new benchmark available measuring the speed of filling a 32 bit RGBA Full HD (or UHD) buffer pixel by pixel. The competitors are ChaiScript and TeaScript.

The source code of the benchmark is here: Bench_BufferOverhead.cpp

Relevant TeaScript code

forall( pixel in _seq( 0, width*height - 1, 1) ) {
    _buf_set_u32( buf, pixel * 4, green )
}
_buf_size( buf ) // return sth...

Relevant ChaiScript code

for( var pixel = 0; pixel < width * height - 1; ++pixel ) {
    _buf_set_u32( buf, pixel * 4, green );
}
buf.size(); // return sth ...

Please note: Neither ChaiScript nor (the current) TeaScript is a first choice for filling image buffers from a performance perspective. The benchmark is done to compare their performance relative to each other (and later to show performance improvements in future versions).

Additionally: Filling an image buffer with a background color is usually not done pixel by pixel. This is done here only for the benchmark. See write_image.tea for how it could be done fast.

Here is the benchmark result in seconds (shorter is faster/better):

[Figure: Benchmark result for filling a 32 bit RGBA Full HD buffer, ChaiScript VS TeaScript]

As you can see, TeaScript is currently more than twice as fast as ChaiScript in this discipline, which is great already.
But there is also room for further improvements.

This and more benchmarks will be executed again with the next version of TeaScript, which will have its own integrated Tea StackVM (see Outlook).

Misc

New String functions

Use strsplit() to split a String into a Tuple with String elements.

Use strjoin() to join a Tuple into a new String.

Use _strfromascii() to get a String for the given U8 ASCII character.

See the provided test scripts example_v0.13.tea and corelibrary_test04.tea for usage examples. Hint: There is a syntax highlighting config for Notepad++ for easier reading of TeaScript code.

New Host Application functionality

In the interactive shell, functions now show their parameter names if they are implemented in TeaScript, or the number of parameters if they are based on C++ functions. Use debug function_name or :ls vars to view all functions and variables from the global scope.

Use :search str in the interactive shell to search for a variable or function in the global scope containing str in its name.

[Screenshot: Function parameters are visible now and a search can be performed in the interactive shell of TeaScript.]

Added a batch processing mode.
Start the Host Application with --batch <batchfile> to process the given batch file before the interactive shell is launched. Each line in the batch file will be processed as if it were typed manually in the interactive shell. This is useful if the same preparation work must always be done (e.g., changing the working dir, loading a file, adding a function, etc.).
Use #echo off to turn off every echo by the shell itself (this will not affect print(), etc.). Don’t forget to turn it on again with #echo on!
This can be combined with :silent.
Note: #echo on/off will only be recognized during batch processing!

New C++ functionality

Use ValueObject::InternalType() to get or switch over the first class citizen types of the ValueObject variant. This is useful for a fast switch() and is an alternative to the Visit() member function.

The ValueObject class now has a bunch of AssignValue() member functions as a convenient and lightweight way to set a specific value in the object.

Added AddSharedValueObject() to class Engine / EngineBase as a way to add ValueObjects which already live outside of the engine.

More Misc

The Parser

The Parser can now be disabled and enabled again during parsing. When disabled, it will only parse lines starting with a hash; everything else will be skipped.

Added the parser commands ##enable and ##disable. The latter disables the parser starting from the current line, the former enables it again.

Added ##enable_if keyword [op value] and ##disable_if keyword [op value].
The first (and currently only) keyword is version. It must be used with one of the comparison or equality operators. The value must be a version number of the form major[.minor[.patch]].
If the condition is true, the Parser will be enabled or disabled, respectively.

This is useful, for example, if multiple TeaScript versions with a different syntax feature set must be supported. Then the unsupported syntax code block can be disabled for older versions.

Example: ##disable_if version < 0.14 will disable all following lines for TeaScript versions older than 0.14 until a hash line occurs which enables the Parser again.

Note: This feature is only practically usable starting from the next version, since all older versions ignore this parser command.

Refactored the storage layout of Context

The storage layout of the Context class was refactored in a first small step to use the Collection class of TeaScript for storing all variables and functions.
As a result, the biggest bottlenecks and drawbacks are gone now.

An undef of a variable, as well as a share assign of a variable, now comes without an additional performance penalty. The former has a speedup of more than a factor of 20(!) and the latter a factor of 5.x.

Also, the ValueObject is now stored only once (meaning the share count is usually 1 and not 2).

The improvements can be verified with a benchmark: Bench_VariableLookup.cpp

Breaking changes

Changes in CoreLibrary functionality

_strat() / CoreLibrary::StrAt() will now return strings of length [0..4], depending on the number of bytes of the UTF-8 code point.
It will always return a full and complete UTF-8 encoded glyph. If the wanted pos is in the middle of a UTF-8 code point, this complete UTF-8 code point will be returned.

readtextfile() / CoreLibrary::ReadTextFile() will now do a complete UTF-8 validation of the read input.
Also, it will not throw anymore but return Bool(false).

Changes on C++ API level

Util.hpp is split into 3 files: Util.hpp, UtilContent.hpp, UtilInternal.hpp.
Most likely you still only need Util.hpp.

ASTNode::Eval() has been made const.

Parser::Num() throws exception::parsing_error instead of std::out_of_range

exception::bad_value_cast now inherits from exception::runtime_error (w. SourceLocation) instead of std::bad_any_cast

Engine::AddVar/AddConst w. unsigned int overload now adds a U64 instead of I64.
(NOTE: If a future U32/I32 is added, these overloads (and those for int) will change again/as well!)

Tuple is first class citizen now.
This must be taken into account for a visitor applied to ValueObject::Visit.

Visibility of Core Library functions
The following Core Library functions moved up to Level CoreReduced:
    
_f64toi64
    
The following Core Library functions moved up to Level Core:
    
The following Core Library functions moved up to Level Util:
    
The following Core Library functions moved up to Level Full:

Deprecation

The following deprecated parts have finally been removed in this release:

bool Parser::Int( Content & )
Use Integer( Content & ) or Num( Content &, bool ) instead.

func eval( sth_unspecified ) from TeaScript Core Library (use the not confusing _eval( code_string ) variant instead).

The following parts are now deprecated and will be removed in some future release:

Engine::ActivateDeprecatedDefaultMutableParameters()
Please, change your script code to explicit mutable parameters with ‘def’ keyword. More details are available in the comment.

CoreLibrary::DoubleToLongLong()
This was the implementation for _f64toi64. In C++ just use a static_cast!

ArithmeticFactory::ApplyBinOp()|ApplyUnOp()
Please, use ArithmeticFactory::ApplyBinaryOp() and ArithmeticFactory::ApplyUnaryOp() instead.

Context::BulkAdd()
Please, use InjectVars() instead.

Context( VariableStorage const &init, TypeSystem && rMovedSys, bool const booting )
Please use a different constructor instead.
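
Regarding the deprecated CoreLibrary::DoubleToLongLong(): the suggested replacement is a one-liner in C++. The wrapper name F64ToI64 below is only chosen for this sketch. Keep in mind that a static_cast from double to an integer type truncates toward zero, and that the behavior is undefined if the truncated value does not fit into the target type, so check the range first if the input is not under your control.

```cpp
// Hypothetical wrapper, only for illustration.
// Truncates toward zero: 3.9 becomes 3 and -3.9 becomes -3.
inline long long F64ToI64( double d )
{
    return static_cast<long long>( d );
}
```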

For Visual Studio Users

Depending on your project / code size it might be possible that you must add the /bigobj flag to the “Additional Options” settings.

Unfortunately Microsoft still has not set this as the new default and every now and then a developer runs into this annoying issue.

As you can read on StackOverflow the setting can just be activated if the code reaches a specific size.

If you also find it very annoying you might vote or write a comment in the corresponding topic in the Microsoft developer community as I did as well.

More Info

More information about the TeaScript language is available here:

✯ Overview and Highlights
✯ TeaScript language documentation
✯ Core Library documentation

☛☛☛ Try and download TeaScript here. ☚☚☚

What was new?

If you missed the previous releases / blog posts, here is the collection. They are a nice introduction to the new features and give a good overview.

Release of TeaScript 0.12.0 🌈 – Colored Output, Format String, Forall Loop, Sequences, interactive debugging.
Release of TeaScript 0.11.0 🎂 – TOML Support, Subscript Operator, Raw String Literals.
Release of TeaScript 0.10.0 🌞 – Tuples/Named Tuples, Passthrough Type, CoreLib config.

Outlook

The next TeaScript version 0.14 will focus on technical improvements and new architectural components. The biggest feature will be an integrated virtual stack machine, the Tea StackVM.

This will (hopefully) bring some performance improvement for executing the script code. Of course, the ‘compilation’ will be done automatically under the hood (as in e.g., Python and Lua).
TeaScript will also keep its AST evaluation (especially for the interactive shell), but the new Tea StackVM will be used as the default for executing the script code.
Along with that release I will also publish a benchmark comparing eval VS compile VS ChaiScript VS Python …

The Tea StackVM feature will also be the foundation for single step debugging and fine grained execution possibilities (pause/resume, etc).

Final notes

I am interested in your feedback.

Do you have feature wishes for TeaScript?
Do you like or dislike this current release? Don’t hesitate to let me know your opinion. I would be happy about any feedback. 🙂

3 replies on “Release of TeaScript 0.13.0”

Hi again,
back to Tea for a while…
I am in search of how to invoke a script functions from the c++ host.
I have read through the overview-and-highlights a number of times.
May be I am missing something but I can’t find it.
If so, then the way to achieve this is by executing script code and pushing parameters through shared value objects. This works:

—– Initialize(….) —-
RegisterCallbacks( … );
ValueObject sharedId( Integer( 0 ) );
sharedId.MakeShared();
eng.AddSharedValueObject( "Identity", sharedId );

—– Run( int identity ) ….
sharedId.AssignValue( Integer( identity ) );
auto valObject = eng.ExecuteCode( script );

——– EntityChanged.tea ——-
uses registered call backs with shared variable “Identity”

————————————–

I can get a function value with this:
ValueObject func = eng.GetVar( "onEntityChanged" );
But I fail to find a way to make good use of it. Can I call that tea script function somehow?

Cheers,
Conrad

Hi Conrad,
thank you very much for your feedback!

You are right, there isn’t a highlevel API yet for nicely invoke TeaScript functions from C++.
For the time being you need the Context instance from your engine (e.g. derive from it and implement a simple inline getter).
Furthermore you must build a std::vector<ValueObject> for the function parameters manually.

With that you can invoke the function similar like this:

ValueObject func = eng.GetVar( "onEntityChanged" );
func.GetValue<teascript::FunctionPtr>() -> Call( engine.GetContext(), parameter_vector, SourceLocation() );

I added this feature to my todo list for the next release.
I imagine something like this:
engine.InvokeFunc( "name", param1, param2, ... );

For your initialize(…) part, I think you could implement it more easy with this:

eng.AddVar( "identity", 0 ); // will be made shared as well.
ValueObject sharedId = eng.GetVar( "identity" );

For big types, like Buffer, the interface does not offer passing by copy but by move only.
That means TeaScript takes ownership of the data. But if you also need the data outside, the only good way is to store it in a ValueObject first and
then use AddSharedValueObject().
For simple types like Integer, Bool, etc. this is most likely not needed.

Cheers,
Florian

Hi Florian,
thanks, that helped.
I had discovered this from your benchmark code at github, line 284.
I copied part of it but I kept getting exceptions e.g:
> FromParamList ASTNode: Too less arguments!
I simply copied what your code seems to do regarding the context c.
With the engine.mContext, I get further.

As far as the Initialize is concerned, I am still in early stages.
Now that the call works with parameters, I don’t need the shared variable anymore. But I need to structure it to achieve some sort of Eval -> Compile – Wait for it. Then on-events, run-by-calling known script functions. (The initial compile is an opportunity for scripts to make their methods known and make some global variables)

Engine.InvokeFunc(..) sounds nice. May it should be allowed to return a value too. Else the vector params that seem to get pushed from left to right works for me.
Cheers,
Conrad.
