[vhdl-200x-ft] Re: floating point, thinking out loud

From: Jim Lewis <Jim@SynthWorks.com>
Date: Thu Dec 02 2004 - 16:24:07 PST

David,
> What do you think about this?
As far as I can see, the exponent and mantissa width
are specified with generics. Is there something I am
missing? I am not excited about this.

Does having the widths of the exponent and mantissa constrained
buy us anything? One thing I think you mentioned in the
meeting is that if fp32 is a constrained type and there
are specific subprograms that only handle fp32, then one
can do:
   Y_fp32 <= A_fp32 + "11000000110100000000000000000000" ;

On the other hand, let's suppose that I have fp32 defined
as a subtype of fp and all math operations are implemented
for the type fp.
   subtype fp32 is fp(8 downto -23) ;

Can't I coerce the string literal using a type qualifier
as follows?
   Y_fp32 <= A_fp32 + fp32'("11000000110100000000000000000000") ;

> Will this fill your need for a user definable floating point?
I need to see the trade-offs to really get a handle on the
situation. I am still of the opinion that if we leave exponent
and mantissa unconstrained, we will end up with fewer package
instantiations and perhaps be able to instantiate most/all
of them in the IEEE library.

Part of my concern is that these packages are the start of
floating point. As silicon area increases, what we do on a
chip will increase. The use model needs to be basic and easy
to do, as with increased silicon area, we will be building
things like complex numbers and matrix math on top of them.

--------------------------------------------------------------
Here is one area I would like to explore. I need to put more
time into it and will not be able to until after the next meeting.
If we use the subtype approach, I speculate that we will also
need a package which defines aliases. For example we will
need to create an alias that points to the IEEE 754/854
standard version of fp32:

Package fphdl_alias is
   alias ieee_fp32 is ieee.fphdl_w_x_y_z.fp32 ;
   . . .
end fphdl_alias ;

Then I am thinking that we can knit it all together
so that using fphdl looks as simple as a package reference:

context fphdl_ctx is
   library ieee ;

   use ieee.fphdl_types.all ;
   -- make all parameterized fphdl packages visible for usage:
   use ieee.fphdl_xxx.all ;
   use ieee.fphdl_yyy.all ;
   ...
   use ieee.fphdl_zzz.all ;

   use ieee.fphdl_alias.all ;
end context ;

Even without the exponent and mantissa widths
specified, this still looks like a large number of
permutations for the fphdl package.

Computer-based numeric representations are meant to get
good results with fixed width objects. Hence we have
standard definitions for 32 and 64 bit floating point.
Hardware is not constrained to word widths. In
hardware we can use 38 bits if we wish.

Are guard bits extra bits for the "mantissa" that are
needed to maintain precision for some operations? Is
there some way to take advantage of this and reduce
the number of package permutations we need? After
synthesis if the LSBs of an array always get set to
0, then effectively they are not part of the
logic and this effect will be propagated to simplify
other hardware.

------------------------------------
Going in an entirely different direction, are there
any language enhancements we need to make this more
workable? It sure would be nice if we had a
way to augment a type or subtype with the "other" things
we are specifying with generics: fp_round_style,
fp_denormalize, fp_check_error, and fp_guard_bits.
Effectively this would become an attribute that
maintains its value through a call to a subprogram.

It is really an ironic shame about the attributes
on signals not being passed into the subprogram.
They are not passed because the LRM allows constant
parameters to be passed by value, however, for
array parameters, the LRM also allows them to be
passed by reference. So the bottom line seems to
be that the attributes are not available because
the object may be passed by value although it is
probably passed by reference.
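
A minimal sketch of the limitation described above. The entity name
attr_demo, the attribute name guard_bits, and the function width_of
are hypothetical and exist only for illustration. A user-defined
attribute is visible where the signal is declared, but inside the
subprogram only the predefined attributes of the formal parameter
(such as 'length) are available:

```vhdl
library ieee ;
use ieee.std_logic_1164.all ;

entity attr_demo is
end entity attr_demo ;

architecture sketch of attr_demo is
   -- Hypothetical user-defined attribute, standing in for a setting
   -- such as the number of guard bits.
   attribute guard_bits : natural ;

   signal a : std_logic_vector(31 downto 0) := (others => '0') ;
   attribute guard_bits of a : signal is 3 ;

   function width_of (x : std_logic_vector) return natural is
   begin
      -- Predefined attributes of the formal parameter are available.
      return x'length ;
      -- The next line would not compile: the formal x is a separate
      -- named entity and carries no guard_bits attribute of its own.
      -- return x'guard_bits ;
   end function width_of ;
begin
   -- The attribute can be read where the signal itself is visible.
   assert a'guard_bits = 3
      report "guard_bits of a is not 3"
      severity failure ;

   -- Only the predefined array attributes survive the call.
   assert width_of(a) = 32
      report "width_of(a) is not 32"
      severity failure ;
end architecture sketch ;
```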

------------------------------------

I should be able to do a more detailed review of what you
are proposing sometime after the next meeting and perhaps
then be able to propose an alternative.

Cheers,
Jim

> Jim:
>
> What do you think about this? Will this fill your need for
> a user definable floating point?
>
> -------- Original Message --------
> Subject: floating point, thinking out loud
> Date: Tue, 30 Nov 2004 11:37:57 -0500
> From: David Bishop <dbishop@vhdl.org>
> To: Jim Lewis <Jim@synthworks.com>, Peter Ashenden
> <peter@ashenden.com.au>, dbishop@vhdl.org
>
> Thinking of this use model:
>
> package fphdl_types_package is
> type round_style ....
> type valid_fpstate ...
> type float is array (integer range <>) of STD_LOGIC;
> subtype fp32 is float (8 downto -23); -- IEEE single precision
> subtype fp64 is float (11 downto -52); -- IEEE double precision
> (any other standard data widths we can think of)
> end package fphdl_types_package;
>
> package fphdl_pkg is
> generic (
> fp_fraction_width : NATURAL;
> fp_exponent_width : NATURAL;
> fp_round_style : round_style;
> fp_denormalize : BOOLEAN;
> fp_check_error : BOOLEAN;
> fp_guard_bits : NATURAL
> );
> subtype fp is float (fp_exponent_width downto -fp_fraction_width);
> .....
>
> end package fphdl_pkg;
>
> In the IEEE library we would still do:
> package fphdl32_pkg is new IEEE.fphdl_pkg generic map (
> fp_fraction_width => 23, -- 23 bits of fraction
> fp_exponent_width => 8, -- exponent 8 bits
> fp_round_style => round_nearest, -- round nearest algorithm
> fp_denormalize => true, -- Use IEEE extended floating
> -- point (Denormalized numbers)
> fp_check_error => true, -- Turn on NAN and overflow processing
> fp_guard_bits => 3 -- number of guard bits
> );
>
> package user_float is new IEEE.fphdl_pkg generic map (
> fp_fraction_width => 18,
> fp_exponent_width => 8,
> fp_round_style => round_zero, -- truncate
> fp_denormalize => false,
> fp_check_error => false,
> fp_guard_bits => 0);
>
> use ieee.fphdl32_pkg.all;
> use work.user_float.all;
> architecture RTL of something is
> subtype fp27 is float (8 downto -18);
> signal a, b : fp27;
> signal x, y : fp32;
> begin
> a <= resize (x, fp27'high, -fp27'low);
>
> The user would have to declare and add the generic package for the
> type before using it. The "fp" type will still be there, but hidden.
> If you create a type without the generic package, you would get a
> "no function for" error from the compiler.
>
> Upside: You can now create your own "float" type. You just have to
> parameterize a package for it.
> Downside: You could do a 32 or 64 bit math package without denormal
> numbers if you created it yourself.
>
Received on Thu Dec 2 16:24:21 2004
