Here are a couple of bits of D (google DLANG) for you to enjoy. The first one is a one-liner but has a vast mountain of test code, a lot of it done at compile time — in fact it won't build if it has bugs. That is because of D's amazing CTFE (compile-time function evaluation), whereby it evaluates absolutely everything that is conceivably possible at compile time, and when it can, it deletes an entire complex routine, just replacing it with the computed result. Of course C can evaluate expressions at compile time, but it seems to be relatively shy about it. The first example just tests whether a number is a power of two or not.
// dlang d ispowerof2
module test;
import std.stdint;
// True iff x is a power of two: x > 0 and exactly one bit set.
// Template works for any integral T; zero and negative inputs yield false.
pure nothrow @safe @nogc
bool IsPowerOf2(T)( T x )
// Return non-zero if arg is a power of two
// Is currently strictly true or false ret val - despite the choice of implementation with &
in {
// No preconditions: every value of T is acceptable input.
}
out ( ret )
{
assert( ret == true || ret == false ); // strict bool restriction currently
debug
{
// Debug-build cross-check: brute-force scan of every single-bit value
// of T's width, compared against the bit-trick result.
bool b = false;
//for ( uint s = 0; s <= 8 * x.sizeof -1; s++ )
foreach( s; 0.. 8 * x.sizeof )
{
// NOTE(review): x == (1uL << s) mixes signed and unsigned when T is a
// signed 64-bit type; for x == long.min this compares equal at s == 63
// while ret is false - verify debug builds with that input.
b = b || ( x == (1uL << s) );
}
assert( ret == b );
}
}
body
{
// Classic trick: a power of two has one set bit, so x & (x - 1) clears it
// to zero; the (x > 0) term rejects zero and all negative inputs.
return ( ( x & (x - 1) ) == 0 ) & (x > 0); /* Should really be an '&&' */
/* but using an '&' is in fact completely safe in ANY CASE as both operands contain proper comparisons in them anyway
// and alternatively the function could be re-specced to be ret non-zero=true.
// The change to a '&' is just in case a horrid compiler emits branches.
// GDC and LDC do not do so even with &&, as it happens, a miracle (tested with -O3 in both compilers). */
}
// Non-template shims: force IsPowerOf2 to be instantiated (and hence
// compile-checked) at the common integer widths and signednesses.
bool t1( int x )      { return IsPowerOf2( x ); }
bool t1( uint x )     { return IsPowerOf2( x ); }
bool t1( uint64_t x ) { return IsPowerOf2( x ); }
// Constant-input probes: each calls IsPowerOf2 with a known constant, so
// CTFE / the optimizer can fold every one of these bodies to a literal.
bool xxx8() { const uint x = 8;      return IsPowerOf2( x ); }
bool yay5() { const size_t x = 5;    return IsPowerOf2( x ); }
bool zzz0() { const size_t x = 0;    return IsPowerOf2( x ); }
bool zzz1() { const uint64_t x = 1;  return IsPowerOf2( x ); }
// Compile-time test suite: every check is a static assert, so the module
// fails to BUILD if IsPowerOf2 is wrong - nothing is deferred to runtime.
void do_static_unittests()
{
// Composite constant expressions that fold to non-powers (39 and 767).
static assert( ! IsPowerOf2( 32 / 4 * (4 +1 ) -1 ) );
static assert( ! IsPowerOf2( 512 / 2 * (2 +1 ) -1 ) );
// Small non-powers, exercised across all four integer widths/signednesses.
{
enum n = 6;
static assert( ! IsPowerOf2( cast(uint) n ) );
static assert( ! IsPowerOf2( cast(int) n ) );
static assert( ! IsPowerOf2( cast(ulong) n ) );
static assert( ! IsPowerOf2( cast(long) n ) );
}
{
enum n = 7;
static assert( ! IsPowerOf2( cast(uint) n ) );
static assert( ! IsPowerOf2( cast(int) n ) );
static assert( ! IsPowerOf2( cast(ulong) n ) );
static assert( ! IsPowerOf2( cast(long) n ) );
}
{
enum n = 9;
static assert( ! IsPowerOf2( cast(uint) n ) );
static assert( ! IsPowerOf2( cast(int) n ) );
static assert( ! IsPowerOf2( cast(ulong) n ) );
static assert( ! IsPowerOf2( cast(long) n ) );
}
// Edge cases: zero and all-bits-set, at every width/signedness.
static assert( ! IsPowerOf2( 0u ) );
static assert( ! IsPowerOf2( 0 ) );
static assert( ! IsPowerOf2( 0uL ) );
static assert( ! IsPowerOf2( 0L ) );
static assert( ! IsPowerOf2( ~0u ) );
static assert( ! IsPowerOf2( ~0 ) );
static assert( ! IsPowerOf2( ~0uL ) );
static assert( ! IsPowerOf2( ~0L ) );
// Single set bits (true) and their +/-1 neighbours (false) near the top of
// each type: i = 30 fits all 32-bit types, 31 unsigned-32 only, 62/63 64-bit.
{
enum i = 30;
static assert( IsPowerOf2( 1u << i ) );
static assert( IsPowerOf2( 1 << i ) );
static assert( IsPowerOf2( 1uL << i ) );
static assert( IsPowerOf2( 1L << i ) );
static assert( ! IsPowerOf2( (1u << i) + 1) );
static assert( ! IsPowerOf2( (1 << i ) + 1) );
static assert( ! IsPowerOf2( (1uL << i ) + 1) );
static assert( ! IsPowerOf2( (1L << i ) + 1) );
static assert( ! IsPowerOf2( (1u << i) - 1) );
static assert( ! IsPowerOf2( (1 << i ) - 1) );
static assert( ! IsPowerOf2( (1uL << i ) - 1) );
static assert( ! IsPowerOf2( (1L << i ) - 1) );
}
{
enum i = 31;
static assert( IsPowerOf2( 1u << i ) );
static assert( IsPowerOf2( 1uL << i ) );
static assert( ! IsPowerOf2( (1u << i) + 1) );
static assert( ! IsPowerOf2( (1uL << i ) + 1) );
static assert( ! IsPowerOf2( (1u << i) - 1) );
static assert( ! IsPowerOf2( (1 << i ) - 1) );
static assert( ! IsPowerOf2( (1uL << i ) - 1) );
static assert( ! IsPowerOf2( (1L << i ) - 1) );
}
{
enum i = 62;
static assert( IsPowerOf2( 1uL << i ) );
static assert( IsPowerOf2( 1L << i ) );
static assert( ! IsPowerOf2( (1uL << i ) + 1) );
static assert( ! IsPowerOf2( (1L << i ) + 1) );
static assert( ! IsPowerOf2( (1uL << i ) - 1) );
static assert( ! IsPowerOf2( (1L << i ) - 1) );
}
{
enum i = 63;
static assert( IsPowerOf2( 1uL << i ) );
static assert( ! IsPowerOf2( (1uL << i ) + 1) );
static assert( ! IsPowerOf2( (1uL << i ) - 1) );
}
}
A more substantial example, which provides access to the Intel x86-64 pext instruction, with a software replacement — controlled by a version switch — if the instruction is not there. This produces an alternative build controlled by compiler switches. An externally defined #define and an #ifdef would be used in C for this, but D does not have a preprocessor. Every use case where you would need the preprocessor is dealt with by a raft of specially designed features that cover all the advanced C programmer's needs. This routine is specific to the GDC compiler, a member of the GCC family, and it uses an interfacing facility to integrate assembler code into the program which is very similar to that used by the GCC C compiler. The pext_insn routine shows some assembler which will generate the x64 pext instruction inline optimally if the instruction is available. Older Intel CPUs will not have this instruction, and so some long-winded code is generated in that build.
import std.stdint;
// Smaller of two values of the same type T (returns b when a == b,
// indistinguishable by value).
pure nothrow @nogc @safe
T min( T )( T a, T b)
{
    if ( b < a )
        return b;
    return a;
}
// Larger of two values of the same type T (returns b when a == b,
// indistinguishable by value).
pure nothrow @nogc @safe
T max( T )( T a, T b)
{
    if ( b > a )
        return b;
    return a;
}
/+
bool test_ffff()
{
bool t = true;
enum uint64_t v = ~0uL;
for ( uint64_t m = 0; m <= 0xffff; m++ )
{
for ( uint s = 1; s <= m.sizeof * 8 -1; s++ )
{
//assert( test_pext_soft( v, m >>> s ) == (test_pext_soft( v, m ) >>> s) );
}
}
return t;
}
+/
// Hook the runtime tests into D's built-in unittest machinery
// (executed when the module is compiled with -unittest / -funittest).
unittest { do_unittest(); }
// Runtime test suite for the three pext entry points. The same input/output
// table is applied to pext (dispatcher), pext_soft (software fallback, also
// proven at compile time via static assert / CTFE) and test_pext_insn.
static
void do_unittest()
{
// Dispatcher pext(): identity mask, byte/nibble/alternating-bit gathers,
// zero mask, and sparse two-bit masks.
assert ( test_pext( ~0UL, ~0UL ) == ~0UL );
assert ( test_pext( ~0UL, 0xffff ) == 0xffff );
assert ( test_pext( ~0UL, 0xf0f0) == 0xff );
assert ( test_pext( ~0UL, 0x5555 ) == 0xff );
assert ( test_pext( ~0UL, 0 ) == 0 );
assert ( test_pext( ~0UL, 5 ) == 3 );
assert ( test_pext( ~0UL, 9 ) == 3 );
assert ( test_pext( 0x10 | 2, 0x10 | 2 ) == (2 | 1 ) );
assert ( test_pext( 0x10 | 0, 0x10 | 2 ) == (2 | 0 ) );
assert ( test_pext( 0x08 | 4, 0x10 | 2 ) == (0 | 0 ) );
assert ( test_pext( 0x110 | 2, 0x10 | 2 ) == (2 | 1 ) );
assert ( test_pext( 0x110 | 1, 0x10 | 2 ) == (2 | 0 ) );
assert ( test_pext( 0x208 | 4, 0x10 | 2 ) == (0 | 0 ) );
// Software path, evaluated entirely at compile time (CTFE): the build
// fails if pext_soft is wrong.
static assert ( test_pext_soft( ~0UL, ~0UL ) == ~0UL );
static assert ( test_pext_soft( ~0UL, 0xffff ) == 0xffff );
static assert ( test_pext_soft( ~0UL, 0xf0f0) == 0xff );
static assert ( test_pext_soft( ~0UL, 0x5555 ) == 0xff );
static assert ( test_pext_soft( ~0UL, 0 ) == 0 );
static assert ( test_pext_soft( ~0UL, 5 ) == 3 );
static assert ( test_pext_soft( ~0UL, 9 ) == 3 );
static assert ( test_pext_soft( 0x10 | 2, 0x10 | 2 ) == (2 | 1 ) );
static assert ( test_pext_soft( 0x10 | 0, 0x10 | 2 ) == (2 | 0 ) );
static assert ( test_pext_soft( 0x08 | 4, 0x10 | 2 ) == (0 | 0 ) );
static assert ( test_pext_soft( 0x110 | 2, 0x10 | 2 ) == (2 | 1 ) );
static assert ( test_pext_soft( 0x110 | 1, 0x10 | 2 ) == (2 | 0 ) );
static assert ( test_pext_soft( 0x208 | 4, 0x10 | 2 ) == (0 | 0 ) );
// Same table again through test_pext_insn at runtime (exercises the asm
// path in builds where use_pext_insn is true).
assert ( test_pext_insn( ~0UL, ~0UL ) == ~0UL );
assert ( test_pext_insn( ~0UL, 0xffff ) == 0xffff );
assert ( test_pext_insn( ~0UL, 0xf0f0) == 0xff );
assert ( test_pext_insn( ~0UL, 0x5555 ) == 0xff );
assert ( test_pext_insn( ~0UL, 0 ) == 0 );
assert ( test_pext_insn( ~0UL, 5 ) == 3 );
assert ( test_pext_insn( ~0UL, 9 ) == 3 );
assert ( test_pext_insn( 0x10 | 2, 0x10 | 2 ) == (2 | 1 ) );
assert ( test_pext_insn( 0x10 | 0, 0x10 | 2 ) == (2 | 0 ) );
assert ( test_pext_insn( 0x08 | 4, 0x10 | 2 ) == (0 | 0 ) );
assert ( test_pext_insn( 0x110 | 2, 0x10 | 2 ) == (2 | 1 ) );
assert ( test_pext_insn( 0x110 | 1, 0x10 | 2 ) == (2 | 0 ) );
assert ( test_pext_insn( 0x208 | 4, 0x10 | 2 ) == (0 | 0 ) );
}
// Thin non-template wrappers around pext / pext_soft, kept as separate
// functions so the code generated for each case can be inspected in the
// disassembly (and so CTFE reduction of constant-mask cases is visible).
uint64_t test_pext( ulong x, ulong mask ) pure nothrow @nogc @safe
{
return pext( x, mask );
}
uint64_t test_pext_soft( ulong x, ulong mask ) pure nothrow @nogc @safe
{
return pext_soft( x, mask );
}
// Constant-mask probes for the soft path.
uint64_t test_pext_soft_0( ulong x ) pure nothrow @nogc @safe
{
return pext_soft( x, 0 );
}
uint64_t test_pext_soft_7( ulong x ) pure nothrow @nogc @safe
{
return pext_soft( x, 7 );
}
// NOTE(review): the mask parameter is ignored here - the call hard-codes 56
// to match the function's name; presumably the parameter is a leftover from
// copy-paste. Confirm and drop it if no caller relies on the signature.
uint64_t test_pext_soft_56( ulong x, ulong mask ) pure nothrow @nogc @safe
{
return pext_soft( x, 56 );
}
// NOTE(review): despite the name this calls the pext dispatcher, not
// pext_insn directly; it reaches the asm only when use_pext_insn is true.
static
uint64_t test_pext_insn( ulong x, ulong mask ) pure nothrow @nogc @safe
{
return pext( x, mask );
}
// Build-time selection of the pext implementation. D 'version' identifiers
// replace the C #define/#ifdef pattern: the result is a single compile-time
// constant, use_pext_insn, consulted by the pext dispatcher below.
// bogus assumption about current target cpu
version = Ver_Pext_Instruction_Available;
//not//version = Ver_Pext_Instruction_Disabled;
version ( Ver_Pext_Instruction_Available )
{
version ( Ver_Pext_Instruction_Disabled ) // if instruction use is force-disabled / overridden
{
enum bool use_pext_insn = false;
}
else // instruction use is not force-disabled / overridden
{
enum bool use_pext_insn = true;
}
}
else // if cpu does not have the pext insn
{
enum bool use_pext_insn = false;
}
// Template-value specializations for the two degenerate masks: an all-ones
// mask degenerates to a plain AND, and a zero mask to the constant 0.
// NOTE(review): M is a compile-time template value parameter, so these
// overloads apply only to explicit instantiations such as pext!(ulong, 0uL);
// an ordinary call pext(x, m) with a runtime mask resolves to the general
// two-type template below - confirm this matches the intended usage.
// specialization for all-1s mask
pure nothrow @nogc @safe
TVal_t pext( TVal_t, uint M : ~0u)( in TVal_t x, in uint mask ) { return x & M; }
pure nothrow @nogc @safe
TVal_t pext( TVal_t, ulong M : ~0uL)( in TVal_t x, in ulong mask ) { return x & M; }
pure nothrow @nogc @safe
TVal_t pext( TVal_t, int M : ~0)( in TVal_t x, in int mask ) { return x & M; }
pure nothrow @nogc @safe
TVal_t pext( TVal_t, long M : ~0L)( in TVal_t x, in long mask ) { return x & M; }
// specialization for zero mask
pure nothrow @nogc @safe
TVal_t pext( TVal_t, uint M : 0u)( in TVal_t x, in uint mask ) { return 0; }
pure nothrow @nogc @safe
TVal_t pext( TVal_t, ulong M : 0uL)( in TVal_t x, in ulong mask ) { return 0; }
pure nothrow @nogc @safe
TVal_t pext( TVal_t, int M : 0)( in TVal_t x, in int mask ) { return 0; }
pure nothrow @nogc @safe
TVal_t pext( TVal_t, long M : 0L)( in TVal_t x, in long mask ) { return 0; }
/* ====== */
// General pext dispatcher: extract the bits of x selected by mask and pack
// them into the low-order bits of the result. Routes to pext_soft under CTFE
// or when the instruction is unavailable, otherwise to the pext_insn asm,
// after reconciling the (possibly different) widths of x and mask.
// @trusted: the only non-@safe ingredient is the inline-asm leaf pext_insn.
pure nothrow @nogc @trusted
TVal_t pext( TVal_t, TMask_t )( in TVal_t x, in TMask_t mask )
if ( is( typeof( x & mask ) ) )
in
{
static assert( is( typeof( x & mask ) ) );
}
out (ret)
{
static assert( is( typeof( ret == (x & mask) ) ) );
assert( ret.sizeof >= x.sizeof );
}
body
{
if ( __ctfe ) // make sure that we do not prevent ctfe from working
{
return pext_soft( x, mask );
}
// Note: static if does not introduce a scope, so the 'ret' declared in
// whichever branch survives is visible at the final return below.
static if ( use_pext_insn )
{
// deal with mixed types, as far as possible - the instruction template doesnt handle differing types
// also optimize the case where the mask is too narrow, can use a cheaper width op
static if ( mask.sizeof < x.sizeof )
{ // mask is narrow, optimize
static assert( is( typeof( x & mask ) ) ); // the cast can narrow values, but eg will not handle eg floats
const typeof(mask) narrowed_val = cast(const(typeof(mask))) x; // narrow the x, also ensure same type to make the insn template match
const TVal_t ret = pext_insn( narrowed_val, mask );
}
else
{
static assert( is( typeof( x & mask ) ) ); // the cast can definitely narrow masks, and should, but a cast that is a conversion involving insane types must not be attempted as it would change the mask
const typeof(x) narrowed_mask = cast(const(typeof(x))) mask; // ensure same type to make the insn template match - but it is essential that this merely be a change of width at most
const TVal_t ret = pext_insn( x, narrowed_mask );
}
}
else // if not using the pext insn, use routine with a software loop
{
const TVal_t ret = pext_soft( x, mask );
}
return ret;
}
// Inline-asm leaf emitting the BMI2 pext instruction (GDC extended-asm
// syntax, mirroring GCC's). Only compiled into builds that declare the
// instruction available. Both operands must share one of the two widths
// the instruction supports (32 or 64 bits) - enforced by static asserts.
version ( Ver_Pext_Instruction_Available )
{
pure nothrow @nogc @trusted
T pext_insn( T )( in T x, in T mask )
if ( is( typeof( x & mask ) ) )
in {
static assert( T.sizeof * 8 == 32 || T.sizeof * 8 ==64 ); // reqd by insn
static assert( x.sizeof == mask.sizeof ); // reqd by insn
static assert( is( typeof( x & mask ) ) );
}
out (ret )
{
static assert( ret.sizeof == x.sizeof );
static assert( is( typeof( ret == (x & mask) ) ) );
}
body
{
/* Any choices of type conversions, or widening/narrowing would be applied here */
const T asm_src = x;
const T asm_mask = mask;
T asm_ret;
/* Checks on the restrictions on the pext instruction's operands' widths */
static assert( asm_src.sizeof * 8 == 32 || asm_src.sizeof * 8 ==64 );
static assert( asm_src.sizeof == asm_mask.sizeof );
static assert( asm_src.sizeof == asm_ret.sizeof );
// Extended asm: "=r" = output register, "r" = input register,
// "rm" = input register or memory, "cc" = condition codes clobbered.
// Intel syntax is switched on only for the one instruction, then restored.
asm {
".intel_syntax" "\n\t"
"pext %[ret], %[src], %[mask]" "\n\t"
".att_syntax" "\n"
: [ret] "=r" (asm_ret) /* the format is ret= out r64, src = in r64, mask= in r64 or mem64 or 32-bits for all three instead*/
: [src] "r" (asm_src),
[mask] "rm" (asm_mask)
: "cc" ;
}
return asm_ret;
}
} // end version
// ====
// Choice of case optimisation strategy for pext_soft() - optimize the worst case to be flat time, or variable time for mask values that are low, in the hope these are more frequent
enum pext_algorithm { alg_default, optimize_worst_case, assume_low_values };
// Convenience alias used in pext_soft's template parameter list.
alias pext_algorithm_t = pext_algorithm;
// Software implementation of pext (parallel bit extract): gathers the bits
// of x selected by mask into the low-order bits of the result. Used as the
// CTFE path and as the fallback when the hardware instruction is absent.
// The accumulator width is the narrower of TVal_t/TMask_t, rounded up to
// 32 or 64 bits; wider types are rejected at compile time.
pure nothrow @nogc @safe
TVal_t pext_soft( TVal_t, TMask_t, pext_algorithm_t algorithm_preference = pext_algorithm.alg_default )( in TVal_t x, in TMask_t mask )
if ( is( typeof( x & mask ) ) )
in {
}
out (ret) {
static assert( is( typeof( ret == (x & mask) ) ) );
}
body
{
// FIX: removed "import std.traits : __traits;" - __traits is a language
// builtin (a keyword), not a std.traits symbol, so that import does not
// compile and is unnecessary for the __traits(compiles, ...) test below.
enum bool is_mask_known_at_compile_time = __traits( compiles, { enum e_ = mask; } );
// The working accumulator is based on the width of TVal_t narrowed if mask is narrower, then adjusted up for efficient accumulator ops
enum eff_src_bit_width = min( TVal_t.sizeof, TMask_t.sizeof ) * 8;
static if ( eff_src_bit_width <= 32 )
{
alias TSrcAcc_t = uint32_t;
}
else static if ( eff_src_bit_width == 64 )
{
alias TSrcAcc_t = uint64_t;
}
else static assert( false, "First argument is of a type not supported" );
static assert( TSrcAcc_t.sizeof >= min( TVal_t.sizeof, TMask_t.sizeof ) );
alias TRet_t = TSrcAcc_t;
TRet_t ret = 0;
static assert( ret.sizeof >= min( TVal_t.sizeof, TMask_t.sizeof ) );
uint destbitpos = 0;
static assert( is( typeof( x & mask ) ) );
alias TMaskAcc_t = TSrcAcc_t; // narrowed as an optimization based on the bits masked off
TSrcAcc_t src = cast(TSrcAcc_t) x & cast(TSrcAcc_t) mask; // knock out the source bits that are zero-ed out by the mask, saves a src & mask in the loop
static if ( algorithm_preference == pext_algorithm.assume_low_values || is_mask_known_at_compile_time ) // variable time, optimize hoping for low mask values being frequent - could actually be much worse, as cost of jumps not measured!
{
// Early-exit variant: loop only while mask bits remain, shifting both
// src and mask right one bit per step.
// NOTE(review): TMaskAcc_t m = mask; relies on implicit conversion - if
// TMask_t is wider than the accumulator this may need an explicit cast;
// confirm against the instantiations actually used.
for ( TMaskAcc_t m = mask; m; m >>>= 1 )
{
assert( destbitpos >= 0 && destbitpos < src.sizeof * 8 );
ret |= ( src & 1 ) << destbitpos; // dont need to do (src & m & 1) as zero-masked bits were already knocked out of src at the start by src = x & mask
destbitpos += m & 1; // advance output position only where mask selects a bit
src >>>= 1;
}
}
else static if ( algorithm_preference == pext_algorithm.optimize_worst_case || algorithm_preference == pext_algorithm.alg_default )
{ // optimize the worst case, all runtimes are flat (checks for special cases - ie mask = all 1s or 0 - are handled earlier)
static assert( eff_src_bit_width <= src.sizeof * 8 );
enum loop_bits = eff_src_bit_width;
TMaskAcc_t m = mask; // using a possibly narrowed type, from min() above
for ( uint i = 0; i < loop_bits; i++ )
{
assert( destbitpos >= 0 && destbitpos < src.sizeof * 8 );
ret |= ( src & 1 ) << destbitpos; // dont need to do (src & m & 1) as zero-masked bits already knocked out of src at the start by src = x & mask
destbitpos += m & 1;
src >>>= 1;
m >>>= 1;
}
}
else static assert( 0, "no pext_soft implementation algorithm was chosen" );
// Sanity: the packed result can never exceed the masked source value.
assert( ret <= ( mask & x ) );
return ret;
}