Everything Functional

everythingfunctional Dec 5, 2022

The other week I attended The International Conference for High Performance Computing, Networking, Storage, and Analysis (commonly known as Supercomputing or SC). It was a great conference. I had a great time and learned a ton, but I noticed an interesting theme that I didn’t really see anybody else mention, so I thought I’d at least write it down.

One of the main events of the conference was Jack Dongarra receiving the Turing Award. He is one of the maintainers of the Top 500 list (the list of the world’s fastest supercomputers). He talked about the linear algebra library he wrote (Linpack) and the benchmark that uses it to measure the performance of a supercomputer, and thus defines the Top 500 list. He noted how many of the supercomputers were achieving near theoretical peak performance on that benchmark, and that it was possible to rank them by energy efficiency as well. But then he noted that most modern problems don’t match this type of benchmark, and put up a slide showing the performance of these computers on a more “realistic” type of problem. At best they were achieving <2% of their theoretical performance. So it seems we’ve been measuring and optimising for the wrong type of problem. But why and how is that?

The Linpack benchmark that’s been used for the last several decades essentially solves a dense system of linear equations, and most of its work boils down to matrix multiplication. The memory access pattern is highly regular and well defined, and thus very easy to split across machines for parallel execution. But most “real world” problems aren’t like this. For many problems the memory access pattern is not very regular, or even worse, the amount and types of calculations to be performed aren’t even known until previous calculations have been performed. This makes it very difficult to efficiently split the data and calculations across machines. Unfortunately, our supercomputers have been built on the assumption that calculation speed is the bottleneck, based on observations from the benchmark used to measure their performance, but it turns out that the movement of data around the machine is the actual bottleneck for most problems.

This idea was supported by a major theme of many of the talks I attended. They were all about data transfer libraries (things like MPI), alternatives to using those libraries directly, task scheduling libraries and frameworks, design patterns to keep data in certain places longer, faster networking hardware, and more of these same kinds of things. All of these technologies and advancements are trying to solve the same fundamental problem: “How do we coordinate data storage and movement so that it’s efficiently accessible to the calculations being performed?”

Now I’m not suggesting that this isn’t a hard problem to solve, especially efficiently for a large class of problems, but it seems like the kind of problem that’s already been solved pretty effectively at a smaller scale. The problem is simple: it’s inefficient for data to be stored far from the place where calculations will be performed on it, so let’s put the data closer when it’s needed and move it away when it’s not needed to make room for the next set of data and operations. The hard part is knowing which data will be needed when and where. But that’s exactly the problem that seems to have been solved by modern operating systems and processors when it comes to the various levels of cache on a multi-core processor versus the main system memory. Our modern desktop machines are, in a way, somewhat simpler, smaller scale versions of large supercomputers. Each core has some of its own cache, there is some shared cache between them, and there is system memory that’s inefficient to access directly. It’s not perfect, but most systems have figured out ways to predict pretty effectively what data should be moved to what place in the system and when. There are some software design patterns that tend to let these predictions be more accurate, and occasionally failure to follow them leads to noticeable performance differences, but most software developers don’t seem to have to think about it anymore.
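One such design pattern in Fortran is traversing arrays in the order they are laid out in memory. Fortran stores arrays in column-major order, so making the first index the innermost loop touches memory contiguously. The sketch below (the module and function names are purely illustrative) computes the same sum both ways. Only the traversal order differs, and on large arrays the second form is typically slower, although the size of the effect depends on the compiler and the hardware.

```fortran
module traversal_order_m
    ! Illustrative names only; both functions return the same sum.
    implicit none
    private
    public :: sum_column_order, sum_row_order
contains
    pure function sum_column_order(a) result(total)
        ! Inner loop runs over the first index, so for a contiguous
        ! array consecutive iterations touch adjacent memory locations.
        real, intent(in) :: a(:, :)
        real :: total
        integer :: i, j

        total = 0.0
        do j = 1, size(a, 2)
            do i = 1, size(a, 1)
                total = total + a(i, j)
            end do
        end do
    end function

    pure function sum_row_order(a) result(total)
        ! Inner loop runs over the second index, so consecutive
        ! iterations are size(a, 1) elements apart in memory.
        real, intent(in) :: a(:, :)
        real :: total
        integer :: i, j

        total = 0.0
        do i = 1, size(a, 1)
            do j = 1, size(a, 2)
                total = total + a(i, j)
            end do
        end do
    end function
end module
```

Timing the two functions on an array much larger than the cache is one way to see the difference, bearing in mind that an optimizing compiler may interchange the loops on its own.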

I’ll admit that a supercomputer does add another layer of complexity onto this. There are more pathways for data movement, a more complex measure of “distance” between processor and memory, and these days a larger variety of processors in the system, but it’s fundamentally the same problem that OSes have already solved. So why doesn’t a supercomputer take advantage of this existing solution? Because a supercomputer isn’t running one operating system! Each node in a supercomputer is running its own operating system, and thus doesn’t have a view into what other nodes of the system have in terms of data, memory and processors. But why? It seems to me that if there were a single unified model and view of the memory, processors and pathways in a supercomputer, i.e. a single operating system, it would be possible to take advantage of the existing solutions to memory and data management.

What do you all think? Is there some fundamental issue with this idea? Has it been tried and failed? Are we already doing this to a degree and I just didn’t know? What would be required to explore this idea? Would it mean a radical change in the architecture of supercomputer systems?

http://everythingfunctional.wordpress.com/?p=355

The State of Fortran Generics

everythingfunctional Jul 25, 2022

I just returned from the joint WG5/J3 meeting (the international and US committees in charge of producing the next revision to the Fortran standard). The Generics subgroup, of which I am a contributing member, had a very successful showing.

The committee discussed and passed 4 “Specification” papers regarding the template feature slated for inclusion in the 202Y revision of the standard. The combination of the papers provides a complete description of the expected semantics of the feature. In this post I will try to summarize, and demonstrate with some examples, what this feature will enable and what it will likely look like.

NOTE: The exact syntax has not been decided. There are many keywords and syntax elements that are still in debate, but the general structure should not change much. It’s entirely likely that the examples shown below will not work unmodified when the standard is finally published.

The four papers that passed, which I will try to summarize in order, define the semantics for:

A new template construct, including where it may appear, and what may appear within it
The scoping rules for templates and their instantiations
A new instantiate statement
A new restriction block and requires statement

Template Construct

The first thing of note is that a new construct will be available. This will be used to define a template, with specific things being “parameterized”. The things allowed to be template parameters are:

types
procedures
integer, logical or character constants (Note that characters must be assumed length, and arrays must be assumed size or assumed rank)

A template may appear in the specification section of a

module
submodule
template

A template can then contain any valid Fortran that could be found within a module, i.e. it can define new types, variables and constants, as well as procedures following a contains statement. There is one caveat. All operations and procedure invocations within a template must have explicit interfaces, and for the purposes of checking those interfaces, all deferred types (types that are template parameters) are treated as completely unique. This has the implication that entities of deferred type can only be assigned to entities declared to be of the same deferred type, and can only be passed as actual arguments to procedures whose corresponding dummy argument is declared to be of the same deferred type. The consequence of such a constraint is that it can be determined a priori that a template will be valid for all actual parameters. The following, somewhat contrived, example illustrates the intended behavior.

type :: u
  ...
end type
...
template tmpl(T, C, S)
  type, deferred :: T
  character(len=*), constant :: C
  interface
    subroutine S(x, y)
      type(T), intent(inout) :: x, y
    end subroutine
  end interface
contains
  function f(x, i)
    type(T), intent(in) :: x
    type(u), intent(in) :: i
    type(T) :: f
    ...
  end function
  subroutine foo(x, u1)
    real, intent(inout) :: x
    type(u), intent(in) :: u1
    type(T) :: t1, t2

    call S(t1, t2) ! Valid
    call S(t1, x)  ! Invalid; x not declared to be of type T
    t1 = f(t2, u1) ! Valid
    x = f(t1, u1) ! Invalid; cannot assign T to real
    t1 = t2 + t2 ! Invalid; + not defined for T + T
  end subroutine
end template

Scoping Rules

The scoping rules for templates are relatively straightforward at this point. A template has host association (i.e. it has access to any entities available in the scope in which it is defined). An instantiation of a template brings into scope only those entities within the template that are public. Thus it will likely be best practice, as many consider it to be for modules, for a template to begin with

template tmpl(...)
  private
  public :: ...
  ...
end template

as well as for instantiations to make use of the only clause. I.e.

instantiate tmpl(...), only: ...

Instantiation

A new instantiate statement provides actual parameters for a template, and makes the template entities available. It also provides a rename mechanism to alleviate any potential name conflicts, and an only clause to limit the entities actually brought into scope. There is some complication involved in the underlying mechanism for this, however. An instantiate statement is said to refer to an instantiation, and instantiate statements with identical actual parameters are said to refer to the same instantiation.

The main benefit to this is that types declared within a template and then instantiated in multiple places are the same type where the rules of Fortran might otherwise consider them to be different types. It also means that procedures with save variables, while not advised, will behave as expected when used in separate places (i.e. the “separately instantiated procedures” will refer to the same saved variable). It will be possible to override this behavior, so that an instantiate statement can produce an entirely separate and unique instance. An example to illustrate is shown below.

template wrapper_tmpl(T)
  type, deferred :: T
  type :: wrapper
    type(T) :: wrapped
  end type
end template

instantiate wrapper_tmpl(real), real_w => wrapper
instantiate wrapper_tmpl(real), other_w => wrapper
instantiate, unique :: wrapper_tmpl(real), w1 => wrapper
instantiate, unique :: wrapper_tmpl(real), w2 => wrapper

! Types real_w and other_w are the same type.
! Type w1 is a different type than all of real_w, other_w and w2
! Type w2 is a different type than all of real_w, other_w and w1

For well designed templates and libraries, users shouldn’t have to think about these complexities most of the time. For compiler writers, however, this complexity could be tricky. An instantiate statement may need to refer to an instance produced by an earlier instantiate statement, in which case it needs to somehow find it, or it may need to create one in such a way that other instantiate statements will be able to find it.

Restriction and Requires

Because certain combinations of template parameters are likely to be common, and have meaningful names, another new block and statement have been added to facilitate the naming and reuse of certain template parameter declarations. A restriction block is a new construct, with a name and template parameters, that can contain declarations of template parameters. A requires statement can then appear in a template or restriction block to “include” those declarations. An illustrative example is shown below.

restriction binary_op(T, U, V, binop)
  type, deferred :: T, U, V
  interface
    function binop(x, y) result(z)
      type(T), intent(in) :: x
      type(U), intent(in) :: y
      type(V) :: z
    end function
  end interface
end restriction
...
template tmpl(T, binop)
  requires binary_op(T, T, real, binop)
contains
  function path_length(steps)
    type(T), intent(in) :: steps(:)
    real :: path_length

    integer :: i

    path_length = 0
    do i = 1, size(steps)-1
      path_length = path_length + binop(steps(i), steps(i+1))
    end do
  end function
end template

Conclusion

The semantics of the new template feature have now been established. The committee still has work to do on finalizing the syntax and then making appropriate edits to the standard, but the basic structure and behavior is now clear.

Acknowledgements

I need to acknowledge all those who helped in this effort. There are too many to name them all, but in particular, Tom Clune (of NASA) has done a tremendous job organizing the subgroup and representing the ideas to the committee. The committee has also been very understanding and provided us great constructive feedback.

http://everythingfunctional.wordpress.com/?p=339

Because Reasons, or Just Because

everythingfunctional Nov 24, 2021

Normally I try not to mix politics and my career, but I kind of have to weigh in on this one. To preface this discussion, I have received the COVID-19 vaccine, and generally have a pro-vaccine position. However, I am a naturally skeptical and questioning person, so I can understand why people may be hesitant, especially with this vaccine, and I don’t think it is fair to demonize people for raising questions or having some resistance. I also don’t want to criticize either of the people I report to directly in this situation. They have made attempts to be as accommodating as possible and I commend them for that.

I am now in a position of being forced to either share my personal medical information or leave my current employment. As an employee of a company with a contract with a federal agency, I am included in the mandate covered by the President’s Executive Order. Now to be clear, the actual wording is a bit more manipulative. All of my employer’s federal contracts will be terminated if they do not verify that all of their employees are vaccinated, i.e. see their vaccination cards. So if I don’t comply, it’s not just me that bears the consequences.

You might at this point be saying, “But this is ensuring workplace safety. It’s perfectly normal to implement precautions to prevent employees from being exposed to potentially deadly pathogens.” Well, I’m a fully remote worker. My job does not require me to go to an office with any of my fellow employees, or onsite to any federal offices. So the representative for our contract specifically asked (and kudos to him for doing so) in a public meeting whether there would be any exemptions allowed for fully remote workers. The answer was an emphatic “NO”. So is this really about health and safety, or is it just comply or else?

It would be bad enough if it were just me getting caught up in this. I have a 7-year-old son. With the vaccines starting to be approved for children, my wife and I consulted with a few medical professionals about whether we should get him vaccinated. He is home-schooled and perfectly healthy, so his chances of exposure, let alone serious illness, are quite low. For this reason, all of the doctors we asked said there wasn’t really much medical reason for him to be vaccinated. But they all suggested he might want to get it anyway, because he is likely to be restricted from certain things without it. In other words, medical decisions about my son are being driven almost entirely by political factors.

So, even if there were no reasons to question the safety or efficacy of the vaccines, which I won’t try to argue against other than to say that there are at least not zero reasons, it’s pretty clear this is about more than just health and safety. This is an example of the political class finding an excuse to demand that we all comply with their authority or else, nuances or applicability to any particular situation be damned. Whether you’re in favor of this particular policy or not, it sets a potentially dangerous precedent.

I haven’t yet decided exactly what actions I will take. But the fact that we as a culture are deferring our medical decisions to political authorities with no capacity for nuance seems like a path towards a totalitarian, anti-science, dystopian society. It makes one wonder what happened to the free country we were supposed to be living in. I’m curious what disease the next vaccine mandate will be for, and whether the vaccine will be as safe and effective as this one.

I don’t want to convince you to get the vaccine or not. I don’t know about your situation, and you should consult trained medical professionals for that kind of advice. I just hope we can avoid demonizing and ostracizing people for having questions or asking for autonomy.

http://everythingfunctional.wordpress.com/?p=330

A What Test?

everythingfunctional Sep 9, 2021

We all know what happens when you make assumptions. Of course, when things are obvious to ourselves, we tend not to notice the possibility they may not be obvious to others. Mea culpa, I made exactly that mistake recently. But in my aspirations to teach software development, that provides me with a great opportunity. I get to explain something that I understand so well, I thought it was obvious to everyone. So what are we talking about today? What is a unit test?

I work with a lot of Fortran programmers that wouldn’t give themselves that title. They are scientists and engineers who happen to write code, because it seems the most efficient way for them to get their work done. That the product of this is software seems almost a strange afterthought to many who see themselves in this role. I started my career this way, so I can understand the sentiment.

I was in the middle of a pair programming session to implement some new functionality in a program, and I was writing a unit test when the people I was working with said something that took me aback. They said “How does *main program* know to call this? Is it a new option in the input file?” And then I realized I hadn’t explained what a unit test is.

So what did they think a test was? You run the whole program, and look at the outputs. Maybe it’s modeling some experiment and seeing that things look like the measurements, or we compare to a case we can solve analytically, or just make some basic sanity checks. The point is, it’s a very manual process, and involves executing the whole program. They’re not software developers, why would they think anything different?

Let’s take a look at an example. It’s a bit contrived, but it’s something like what you might find an engineer writing. I’ve organized it a bit more than you might find in the wild, but that should hopefully make the exercise easier to follow. We’re going to imagine a program that generates values for polynomial functions. Something like the following.

program polynomial_point_generator
    use polynomials_m, only: constant, linear, quadratic
    implicit none
    integer :: num_points, polynomial_degree, i
    real :: x_start, x_increment
    real, allocatable :: xs(:), ys(:), polynomial_coefficients(:)

    print *, "Enter the " &
            // "number of points desired, " &
            // "polynomial degree, " &
            // "x starting point, " &
            // "and x increment:"
    read(*, *)  &
            num_points, &
            polynomial_degree, &
            x_start, &
            x_increment
    allocate(polynomial_coefficients(0:polynomial_degree))
    print *, "Enter polynomial coefficients:"
    read(*, *) polynomial_coefficients
    xs = [(x_start + (i-1)*x_increment, i = 1, num_points)]
    select case (polynomial_degree)
    case (0)
        ys = constant(xs, polynomial_coefficients(0))
    case (1)
        ys = linear( &
                xs, &
                polynomial_coefficients(0), &
                polynomial_coefficients(1))
    case (2)
        ys = quadratic(&
                xs, &
                polynomial_coefficients(0), &
                polynomial_coefficients(1), &
                polynomial_coefficients(2))
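    case default
        ! Degrees above 2 (or below 0) are not implemented; stop here
        ! rather than printing from an unallocated ys
        error stop "polynomial degree must be 0, 1 or 2"
    end select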
    do i = 1, num_points
        print *, xs(i), ys(i)
    end do
end program

module polynomials_m
    implicit none
    private
    public :: constant, linear, quadratic
contains
    elemental function constant(x, c0) result(y)
        real, intent(in) :: x, c0
        real :: y

        y = c0
    end function

    elemental function linear(x, c0, c1) result(y)
        real, intent(in) :: x, c0, c1
        real :: y

        y = c0 + x*c1
    end function

    elemental function quadratic(x, c0, c1, c2) result(y)
        real, intent(in) :: x, c0, c1, c2
        real :: y

        y = c0 + x*c1 + x**2*c2
    end function
end module

You might test such a program manually by having it generate a handful of points for a handful of cases and checking the outputs, perhaps even graphing the linear and quadratic cases to visually verify they look correct. For something like this example that makes sense, and is perfectly reasonable. The problem comes when we’re working on substantially more complex programs, where the relationships between the inputs and outputs are complicated and non-linear. In that case, trying to test all the possible variations in inputs becomes very labor intensive, and verifying that the outputs are correct becomes very difficult. Not to mention it leaves you vulnerable to confirmation bias, i.e. it looks like I expected, so it must be right.

That’s where unit testing comes in. Unit testing is a technique that allows us to test a part of our code, independently from the rest of the program. Preferably, we automate those tests and define objective criteria by which to judge whether they pass or fail so that we can remove a lot of the manual effort and potential for human error in testing our code.

To continue on with our example, imagine we are tasked with adding the capability of generating cubic functions to our program. I highly encourage you to write unit tests for any new functionality you add. Even if you’re not going to go back and write unit tests for the existing stuff, at least write unit tests for the new stuff. And so I wrote something like the following.

BIG DISCLAIMER: These tests do not follow best practices that I recommend. I’ve stripped away all the extra complexity associated with those patterns and techniques to make the example easier to follow. Please don’t write your real tests like this. Although, even these are better than no tests at all.

module cubic_test
    use polynomials_m, only: cubic, quadratic

    implicit none
    private
    public :: &
            test_all_zero_coefficients, &
            test_just_x_cubed, &
            test_matching_quadratic
contains
    function test_all_zero_coefficients() result(passed)
        logical :: passed

        if (0.0 == cubic(42.0, 0.0, 0.0, 0.0, 0.0)) then
            passed = .true.
        else
            passed = .false.
        end if
    end function

    function test_just_x_cubed() result(passed)
        logical :: passed

        if (42.0**3 == cubic(42.0, 0.0, 0.0, 0.0, 1.0)) then
            passed = .true.
        else
            passed = .false.
        end if
    end function

    function test_matching_quadratic() result(passed)
        logical :: passed

        if ( &
                quadratic(42.0, 1.0, 2.0, 3.0) &
                == cubic(42.0, 1.0, 2.0, 3.0, 0.0)) then
            passed = .true.
        else
            passed = .false.
        end if
    end function
end module

So the people I was working with saw me writing these functions and said “where do you call these in the program?” And that’s when I realized we were still thinking completely differently about testing. These functions are not called from the program the user runs. They are called from a separate program that looks something like the following.

program unit_tests
    use cubic_test, only: &
            test_all_zero_coefficients, &
            test_just_x_cubed, &
            test_matching_quadratic

    implicit none

    print *, "Test cubic with all zero coefficients"
    if (test_all_zero_coefficients()) then
        print *, "  passed"
    else
        print *, "  failed"
    end if

    print *, "Test calculating just x**3"
    if (test_just_x_cubed()) then
        print *, "  passed"
    else
        print *, "  failed"
    end if

    print *, "Test with cubic coefficient of zero " &
            // "matches quadratic function"
    if (test_matching_quadratic()) then
        print *, "  passed"
    else
        print *, "  failed"
    end if
end program

Note how these tests do not require any human input, nor human judgement to tell us whether they passed or failed. We can run these tests at any time to easily determine if the code we are testing works as expected. We have taken what would have been a laborious process and automated it. And by testing our program in pieces, we can easily narrow down where any bugs might be if something goes wrong. We also do not have to devise all the inputs to our whole program in order to test one piece of it.
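The tests above assume that polynomials_m exports a cubic function, which the listings here do not show. A minimal sketch of what it might look like, following the pattern of quadratic, is below. It is placed in its own module (cubic_sketch_m, a made-up name) only so that the snippet compiles by itself; in the example program it would presumably be added to polynomials_m and to its public list.

```fortran
module cubic_sketch_m
    ! Hypothetical stand-alone home for cubic; see the note above.
    implicit none
    private
    public :: cubic
contains
    elemental function cubic(x, c0, c1, c2, c3) result(y)
        real, intent(in) :: x, c0, c1, c2, c3
        real :: y

        ! Same term-by-term form as quadratic, plus the x**3 term
        y = c0 + x*c1 + x**2*c2 + x**3*c3
    end function
end module
```

With something like this in place, the main program would also need a case (3) branch in its select case before users could actually request cubic output. The unit tests, however, can already exercise the function without that branch existing.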

There is a lot more to be said about how to write good unit tests, and how to design your software to make it easier to write unit tests. Stay tuned, because those are things I intend to keep talking and writing about. Also, let me know: was this article helpful or confusing? Was there something you’d like to hear more about? I want to produce the stuff you guys find valuable, so feedback is always appreciated. And if you’re interested in learning more and applying this stuff to your own projects, I’m available for training and coaching, so please do contact me.

http://everythingfunctional.wordpress.com/?p=313

Which kinds are real?

everythingfunctional Aug 23, 2021

A discussion that started on the Fortran Discourse got me thinking about how Fortran libraries should support multiple kinds of floating point numbers. I immediately recalled Dr. Fortran’s blog post, but that didn’t really consider the idea from a library developer’s perspective, just a standalone application developer’s. That’s understandable since at that time there wasn’t an ecosystem of Fortran libraries, so pretty much all Fortran developers were writing standalone applications. So the question remains.

For the history lesson, the aforementioned Dr. Fortran blog post does a great job. For a quick recap though, let’s take a look at what the Fortran Standard currently has to say about floating point type/kind. But first, a quick note about how the standard defines a processor. The definition in the standard reads:

combination of a computing system and mechanism by which programs are transformed for use on that computing system

So in the majority of cases that means a combination of compiler (although an interpreter would technically be possible), operating system and hardware. Thus, in theory anything that is allowed to differ between “processors” could be different if any of those things are changed.

Given the following excerpt:

The real type has values that approximate the mathematical real numbers. The processor shall provide two or more approximation methods that define sets of values for data of type real.

And given that type declarations that must be supported are real (with an optional kind parameter) and double precision, and that:

The type specifier for the real type uses the keyword REAL. The keyword DOUBLE PRECISION is an alternative specifier for one kind of real type.
If the type keyword REAL is used without a kind type parameter, the real type with default real kind is specified and the kind value is KIND (0.0). The type specifier DOUBLE PRECISION specifies type real with double precision kind; the kind value is KIND (0.0D0). The decimal precision of the double precision real approximation method shall be greater than that of the default real method.
The decimal precision of double precision real shall be at least 10, and its decimal exponent range shall be at least 37. It is recommended that the decimal precision of default real be at least 6, and that its decimal exponent range be at least 37.

We can see that only two kinds of real are guaranteed to be supported by a processor. Furthermore, the precision and storage size of those kinds are left very open-ended. But what about the iso_fortran_env module and the values real_kinds, real32, real64, and real128? For a given processor it is guaranteed that the kind values corresponding to default real and double precision will be in the real_kinds array, but it is not guaranteed that either will be one of real32, real64 or real128, nor is it guaranteed that any of real32, real64, or real128 will be valid.
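To make that concrete, here is a small sketch of how one might ask, at run time, whether a given kind value is actually provided. The module and function names are invented for illustration; real_kinds is the array from iso_fortran_env discussed above.

```fortran
module real_kind_check_m
    use, intrinsic :: iso_fortran_env, only: real_kinds
    implicit none
    private
    public :: is_supported_real_kind
contains
    pure function is_supported_real_kind(kind_value) result(supported)
        ! True if kind_value is one of the real kinds this processor
        ! provides. The named constants real32, real64 and real128 are
        ! negative when no matching kind exists, so in that case they
        ! cannot match any entry of real_kinds.
        integer, intent(in) :: kind_value
        logical :: supported

        supported = any(real_kinds == kind_value)
    end function
end module
```

Calling is_supported_real_kind(kind(0.0)) or is_supported_real_kind(kind(0.0d0)) should always give true, while the answer for real128 depends on the processor. Note that this is only a query: a declaration such as real(real128) :: x is still rejected at compile time on a processor where real128 is negative.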

So what is a library developer supposed to do? Well, the library is only guaranteed to be usable everywhere if they stick to default real and double precision. But users of the library (the application developers) want to be able to take advantage of the hardware they’re running on, and chances are pretty good the kinds mapping to the hardware representation won’t correspond to default real or double precision, and possibly not even to real32, real64, or real128.
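Sticking to the two guaranteed kinds usually means writing every procedure twice and tying the copies together with a generic interface. A minimal sketch of that pattern (the module and procedure names are invented for illustration) looks like this:

```fortran
module mean_m
    implicit none
    private
    public :: mean

    integer, parameter :: sp = kind(0.0)    ! default real
    integer, parameter :: dp = kind(0.0d0)  ! double precision

    interface mean
        module procedure mean_sp
        module procedure mean_dp
    end interface
contains
    pure function mean_sp(xs) result(avg)
        ! Assumes xs is not empty
        real(sp), intent(in) :: xs(:)
        real(sp) :: avg

        avg = sum(xs) / size(xs)
    end function

    pure function mean_dp(xs) result(avg)
        ! Identical body; only the kind differs
        real(dp), intent(in) :: xs(:)
        real(dp) :: avg

        avg = sum(xs) / size(xs)
    end function
end module
```

Here mean([1.0, 2.0, 3.0]) resolves to mean_sp and mean([1.0d0, 2.0d0]) resolves to mean_dp. Supporting a third kind means a third copy of every procedure, and that copy only compiles on processors where the kind exists.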

For now, the workaround I’ve heard described amounts to something like the following. Note that this must all be done on the system on which the application is intended to be executed.

Compile a program to output the contents of the real_kinds array
Execute that program to determine what kind values are available
Use those outputs to execute a preprocessor on the library to generate code for all the available kinds
Compile the library
Link the library with the application
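The first two steps can be sketched roughly as follows. The module and subroutine names are hypothetical, and the one-kind-per-line output format is simply an assumption about what a preprocessing script might find convenient to read.

```fortran
module real_kinds_report_m
    use, intrinsic :: iso_fortran_env, only: real_kinds
    implicit none
    private
    public :: write_real_kinds
contains
    subroutine write_real_kinds(unit)
        ! Write each available real kind value on its own line to an
        ! already-open unit (e.g. output_unit, or a file for a script).
        integer, intent(in) :: unit
        integer :: i

        do i = 1, size(real_kinds)
            write(unit, '(i0)') real_kinds(i)
        end do
    end subroutine
end module
```

A main program that does nothing but call write_real_kinds(output_unit), with output_unit also taken from iso_fortran_env, completes step 1, and running it on the target system gives the list needed for step 3.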

At this point it is clear that this is a very painful thorn in the side of Fortran library developers. I was discussing this with a colleague, and he told me he thought this was the most compelling example he had heard motivating the development of the generics facility slated for inclusion in the 202Y (the one after next) edition of the standard. If it does nothing else, allowing library developers to write kind-agnostic code will alleviate a major pain point.

http://everythingfunctional.wordpress.com/?p=301

Setting Up Windows Fortran Development

everythingfunctional May 26, 2021

If you’re just getting started with Fortran and your primary computer is Windows, figuring out how to get everything you need installed and configured can be a bit tricky. In this post I’ll provide links and some details for how I set up a Windows machine for doing Fortran development. Your mileage may of course vary, but this is the setup I prefer.

NOTE: I’ve got a YouTube video available to go along with this post that walks through all the steps to get this working, so check that out if watching me do it would make it easier.

Git

Install git by downloading the installer from git-scm.com. Some of the options I choose are:

Include all the Unix tools in the Windows command prompt
Fast-forward only for pull behavior
Checkout Windows line endings and commit Unix line endings
Change the default branch name to main

Gfortran

The installer at equation.com is the best one I’ve found. It just works. Note, when I tried the latest version (11.1) it froze Windows, but the 10.3 version works great.

fpm

Download the Windows executable of the latest release from the GitHub page and put it in the folder C:\Users\USERNAME\AppData\Roaming\local\bin with the name fpm.exe. Don’t forget to add that folder to your PATH environment variable as well.

Python

What? Python? Why? The fortran-language-server, which is the backend for some of the Fortran IDE plugin features, is written in Python. Download and install Python from their website. Then use pip to install the package with the command pip install fortran-language-server.

Atom

I use the text editor Atom. Download it from here. Once installed, install the following plugins:

fortran-language
linter-gfortran
ide-fortran
atom-ide-ui

With all that, you should be ready to develop some Fortran code!
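As a quick sanity check of the setup, something as small as the following should compile and run. This is just a sketch; the file name hello.f90 used below is an arbitrary choice.

```fortran
program hello
    implicit none

    ! If this prints, the compiler is installed and reachable on the PATH.
    print *, "Hello from Fortran!"
end program hello
```

Saved as hello.f90, it can be built from a command prompt with gfortran hello.f90 -o hello. Alternatively, fpm new creates a skeleton project with a similar starter program, which fpm run will then build and execute.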

http://everythingfunctional.wordpress.com/?p=290

The Fortran Package Manager’s First Birthday

everythingfunctional Mar 12, 2021

It was about 1 year ago that I attended my first Fortran standards committee meeting, more formally known as a J3 committee meeting. It was there that I met some brilliant people I’m now happy to call colleagues. It was a very interesting experience, and I’d recommend it to anyone interested in learning about how design-by-committee languages in general, or Fortran specifically, are developed. But what was most exciting was that by the end of the meeting, Ondrej and I had decided to collaborate on the prototype of the Fortran Package Manager (fpm).

Ondrej had already been toying around with the start of an implementation written in Rust. He had chosen Rust because he wanted to model the Fortran package manager after Rust’s, named Cargo. It was an idea I’d had rolling around my head for a while as well. I suggested to him that we switch to using Haskell, as I already had a quite complete Fortran build system written in Haskell using the Shake library. He agreed, we put together a rough outline of the features we’d need and some conventions for packages, and set to work on a prototype.

The meeting was the last week of February 2020. The git logs for fpm show that a version enabling dependencies was merged at the end of May, and logs from my own projects show that I was switching them over to using fpm in late May to early June. It was not long afterwards that others started to try it out and suggest features and how the user interface should be designed.

Very quickly, many of those interested in using fpm also expressed interest in helping with its development. There was one problem. Most people were not familiar with Haskell, making that a significant barrier to their being able to contribute. But everybody had significant experience (or at least interest) in Fortran. So we decided we would rewrite fpm in Fortran, enabling a larger portion of its intended users to also be contributors. With the initial prototype usable for developing libraries, and many already converting their existing libraries to be fpm compatible, it didn’t seem quite so daunting a task as it otherwise would have. On July 21st, the start of that effort was submitted and merged.

It turned out to be a great idea. By mid-November we had created an initial alpha release (version 0.1.0). It included both the Haskell and Fortran versions, but already they were nearly equivalent in terms of capabilities. And while I did have an early hand in contributing to the Fortran version, a large fraction of the work was done by contributors who weren’t even involved in the project until shortly before or even after the decision to rewrite in Fortran. I must express a very hearty kudos and thanks to those contributors.

At this point I don’t think I’ve made source code contributions in more than a couple of months. I stay active and engaged in the discussions and make suggestions for the design of the user interface. When I first started the project I had some concern that I might end up being BDFL and if I ever got busy with other things the project might languish and die. It’s clear now that the community isn’t going to let that happen. They probably don’t really even need me anymore.

To have gotten to this point in less than a year is an incredible success story for an open source project. I attribute much of the success to the fortran-lang.org project for helping to publicize the project and attract users and contributors. Ondrej and Milan have already written about their experiences over the last year working on that project. But I also believe it underscores just how desperately the Fortran community has needed investment in modern tools.

I think it’s clear now that the open source Fortran community is fully on board with developing and adopting modern tools to make their lives easier. They’re also committed to making the language more inviting to new users, including having a curated and well written collection of tutorials in a convenient, centralized place. There is still plenty left to do, but I can see now that the goal I had in mind not much more than a year ago is well on its way to being accomplished, if not already there.

http://everythingfunctional.wordpress.com/?p=268

Is Research Software Likely to Remain a Tangled Mess?

everythingfunctional Feb 26, 2021

No.

I recently came across a post in the fortran-lang forum referring to a blog post that makes an assertion that research software is likely to remain a tangled mess. I have a sufficient amount to say that should hopefully refute such an assertion, and give us some hope for the future.

One of the points made is that researchers simply do not have the time or desire to study the art of software engineering and master that skill. I’m not entirely sure that is true. For starters, researchers are generally members of institutions of higher learning. Why should they feel disinclined to learn? Perhaps the incentive structures in place at universities and laboratories really are that far out of whack, but I haven’t experienced that myself. And at any rate it wouldn’t be much different from suggesting researchers are too busy to learn math. It’s a necessary skill and we should encourage researchers to treat it as such.

On the point of incentive structures, it was suggested that researchers receive no recognition for the software they write, only for the results. But that’s not true anymore. Methods are being developed for citing research software directly. In fact there is now a journal specifically targeting the publication of research software, and a sister journal for publishing educational software so that researchers and educators can get credit for the software they develop.

Another couple of points made are that researchers tend to leave the code in a state where it is difficult to build or doesn’t work on different systems, and that they tend to reinvent the wheel for many things because they simply don’t know other solutions are available. There is an active community working diligently to address these problems. For example, Fortran is a programming language aimed specifically at scientists and engineers. Now that tools like the Fortran Package Manager are available, and with the community support behind fortran-lang.org, it has become much easier to build Fortran software, make it available to others, and find existing solutions.

Finally, having cleaner code would actually make research efforts go faster, not slower. All that time spent fixing flaky build systems, chasing down bizarre bugs, and deciphering cryptic spaghetti code could be used to solve real problems. The scientific community would be so much more productive if it were easier to collaborate and make use of each other’s work. In the same way that we don’t make excuses for surgeons not to wash their hands and keep their instruments organized because they’re just too busy, we shouldn’t make excuses for researchers not to keep their code organized and reusable.

So yes, there are challenges facing the adoption of better software practices for research software. But as I’ve shown, those challenges are being addressed. I don’t think research software is doomed to forever remain a tangled mess.

http://everythingfunctional.wordpress.com/?p=276

Communicating with Whom?

everythingfunctional Jan 20, 2021

Most lay people and programmers early in their career begin with the assumption that we write code to communicate to the computer what we’d like it to do, and how. While this is partly true, it misses a much more important audience: people. And for a much greater reason than that humans must be able to maintain it.

Early in the days of computing, code was indeed written solely for the purposes of getting a computer to perform some calculation or task. People wrote code in binary, hand toggled it into a machine, and hand translated the outputs back into a form to be understood by humans. But it did not take long before we devised more convenient forms and tools to make the code and the machine’s outputs more easily understood by humans. After the decades of research that followed in computer science and programming language design, I contend that software can now be much more powerfully thought of as a formal specification for knowledge documentation and use.

Consider how large volumes of text have been written on every subject imaginable, as ways of transferring knowledge from human being to human being. Particularly in fields of science, textbooks are written to describe observations and assumptions made, deductions that can be followed, conclusions drawn and their implications, all supported with models, mathematical equations and logical reasoning. As an example of where we’re headed, mathematics was developed as a way of providing concise notations and methods of reasoning about numerical concepts. It was then adopted by science as a way of precisely documenting the knowledge arrived at about how the universe behaves.

In the same vein, computer science and programming language design have been developed as ways of making the intentions of the programmer (i.e. the knowledge they wish to document) easier to describe more precisely and formally, in ways that are clearer to human readers. In this light, software can be seen as the precise, formal notation for documenting any human knowledge and/or desire in the same way that math can be seen as the precise, formal notation for documenting the nature of the universe. Strangely, it seems this even managed to occur without it ever having been clear that that was the actual goal. The goal was more about making programmers more productive. But of course programmers are more productive when they can more easily understand the knowledge (code) their colleagues have documented (developed).

But that’s not all. Looking at software this way, we can surmise why it has so quickly “eaten” the world and become so ubiquitous. Because it is so formal and precise that it can be executed by a machine, it allows other human beings to put that knowledge to work without having to first understand it themselves! I can take a piece of software written by someone else and, knowing only the bare minimum about how to interact with it and nothing about its inner workings, use it to accomplish some goal. For instance, I know next to nothing about how my web browser works, the JavaScript it’s running to allow me to write this blog post, the networking software used to transmit it back to some server, the database engine used to store that data, or whatever allowed you to find it, but I have still managed to communicate this message to more people than I’ll probably ever know. I can’t do that with an engineering textbook.

Now don’t get me wrong, I’m not simply going to tell every developer I know about this and magically we’ll all have better software. Unfortunately it’s not that simple. Learning to write software this way is a bit like relearning how to write an essay. You probably weren’t very good at composing a sentence when you were 6, you probably still weren’t great at composing paragraphs by 8, and you were probably just starting to improve at writing essays by the time you were 10. It takes years of study to get good at writing clear, coherent prose. And it takes years of study to get good at writing clear, coherent software. I’ve been at it for nearly a decade now and still come back to code I wrote 6 months ago thinking “this could be so much better.” So practice. It might be a year or two before you’re really good at writing clear and concise functions, another couple of years still to compose coherent modules, and more years still before you’ll be writing whole systems you and your colleagues will be proud of. And even then you’ll still come back 6 months later and go “eh, this could be better.”

But don’t stop, because you’re at the forefront of documenting human knowledge, and making it a powerful tool for people who won’t even have to understand it.

http://everythingfunctional.wordpress.com/?p=260

Difficulties With Test Metrics

everythingfunctional Jan 14, 2021

The answer to the question of whether we should write automated test suites has largely been settled. We absolutely should write unit tests, and possibly even integration and end-to-end tests. But as acceptance of this practice grew, and adoption became more widespread, a follow-up question arose: how many tests should we write? How do we know when we’ve written enough? And thus was born the metric of code coverage.

The problem with the metric of code coverage is that while it gives a nice number for managers to point at and say “this is how far along we are constructing our test suite,” it doesn’t actually measure the important aspect that we’re interested in. What we’d really like to know is whether we have formally defined and tested all of the requirements of our software. The metric of code coverage does not answer this question.

If, for example, I have 80% code coverage from my test suite, it could mean that I’ve only tested 80% of my requirements, or it could mean that 20% of the code in my software is completely unnecessary. And even if I have 100% code coverage, it may be that I still have not tested some requirement. It could be that those lines were not executed with a sufficient range of possible values. And up to now I’ve glossed over the implicit assumption that the tests are actually checking the results and are not just there for the purposes of gaming the metric. Of course software developers would never attempt to game some meaningless metric to please their managers /s.
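A small, contrived sketch can make the 100% coverage case concrete. Everything here is hypothetical: the temperature module, the to_kelvin function, and the requirement that the result never fall below absolute zero are all invented for illustration.

```fortran
module temperature
    implicit none
contains
    ! Hypothetical requirements: convert Celsius to Kelvin, and never
    ! produce a value below absolute zero.
    pure function to_kelvin(celsius) result(kelvin)
        real, intent(in) :: celsius
        real :: kelvin
        kelvin = celsius + 273.15
    end function
end module temperature

program coverage_demo
    use temperature, only: to_kelvin
    implicit none

    ! This single check executes every line of to_kelvin, so a coverage
    ! tool would report 100% line coverage for the function.
    if (abs(to_kelvin(0.0) - 273.15) < 1.0e-4) then
        print *, "test passed"
    else
        print *, "test failed"
    end if

    ! Yet the second requirement was never tested, and it is violated:
    ! an input below -273.15 produces a negative Kelvin value.
    print *, "to_kelvin(-300.0) =", to_kelvin(-300.0)
end program coverage_demo
```

The coverage number is perfect and the test suite is green, but the requirement about absolute zero is neither implemented nor tested, and nothing in the metric reveals that.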

So if code coverage isn’t a metric we should be particularly concerned with, is there a better one? Unfortunately, none that I know of. We may be able to write down a list of requirements, and point to some and say “these aren’t tested,” but additional difficulty comes from the nature of what most software development is. We are quite frequently discovering what our requirements are through trial and error. There are often requirements we don’t know about. We must often develop the software as a way of discovering what it’s supposed to do.

It is for this reason that I am a big proponent of the practices of test driven development (TDD) and pair programming. By adhering strictly to the rules of TDD that I don’t write any more code than is necessary to pass my current test suite, I know that I have only satisfied the requirements that I know and care about thus far. Thus, any missing requirements become apparent in the lack of some capability of the software. By involving domain experts in my pair programming sessions, I can be confident that my tests actually do specify and test the requirements.

This is why the clarity of your test suite code, and the capabilities of your testing framework are so important. If I am new to a project, and the test suite is hard to read and understand, and the testing framework can’t nicely report the requirements to me, then I can’t tell what the requirements of the software actually are. If I can’t tell what the requirements of the software are, I’m going to have a hard time making modifications that conform to them.

So spend less time worrying about code coverage, and more time worrying about requirements coverage and your test suite’s comprehensibility.

http://everythingfunctional.wordpress.com/?p=254
