Tutorial: Difference between revisions

From openpipeflow.org
Line 36: Line 36:
  > less Re2400a1.25/main.info
  > less Re2400a1.25/main.info


Edit the current parameters so that they are the same
In the previous terminal window, edit the parameters so that they are the same as in the given <tt>main.info</tt> file
  > nano program/parameters.f90
  > nano program/parameters.f90
You may/should ignore from i_KL onwards.
You should ignore from 'i_KL' onwards.


=== Compile and setup a job directory ===
=== Compile and setup a job directory ===
Line 60: Line 60:
       state.cdf.in
       state.cdf.in


=== Starting the run ===
=== Start the run ===


* Next an '''initial condition''' <tt>state.cdf.in</tt> is needed. '''NOTE''': Any output state, e.g. state0012.cdf.dat can be copied to <tt>state.cdf.in</tt> to be used as an initial condition. If resolutions do not match, they are '''automatically interpolated or truncated'''.  
To start the run
<pre>&gt; mv install ~/runs/job0001
> nohup ./main.out > OUT 2> OUT.err &
&gt; cd ~/runs/job0001/
&gt; cp .../state0012.cdf.dat state.cdf.in</pre>


* '''To start''' the run (good to first double-check in [[main.info]] that the executable was compiled with correct parameters)
'&' puts the job in the background.
<pre>&gt; nohup ./main.out &gt; OUT 2&gt; OUT.err &amp;</pre>
'nohup' allows you to logout from the terminal window without 'hangup' - otherwise, a forced closure of the window could kill the job.
After a few moments, press enter again to see if main.out stopped prematurely.  If it has stopped there will be a message e.g. '[1]+ Done nohup...'; check <tt>OUT.err</tt> or <tt>OUT</tt> for clues why.
Output and errors normally sent to the terminal window are redirected to the OUT files.
 
* '''To end''' the run, type
<pre>&gt; rm RUNNING</pre>
and press enter.  This signals to the job to terminate (cleanly). 
Wait a few seconds then press enter again. There should be a message like,
'[1]+ Done nohup...', to say that the job has ended.


A few seconds after starting the job, press enter again.
If there is a message like
'[1]+ Done nohup...'
then it is likely that there was an error.  In that case, try
> less OUT
> less OUT.err
If there is a message about MPI libraries, and earlier you changed <tt>_Np</tt> from another value to 1, then you could try
running instead with
> mpirun -np 1 ./main.out > OUT 2> OUT.err &
You might need to include a path to mpirun; search your <tt>Makefile</tt> for the mpirun command.


* '''NOTE''': I generate almost all initial conditions by taking a state from a run with similar parameters. If there is a mismatch in <tt>i_Mp</tt>, use the utility [[changeMp.f90]].
=== Monitor the run ===


=== Monitoring a run ===
> tail OUT
 
Immediately after starting a job, it’s a good idea to check for any warnings
 
<pre>&gt; less OUT</pre>
To find out number of timesteps completed, or for possible diagnosis of an early exit,
 
<pre>&gt; tail OUT</pre>
The code outputs timeseries data and snapshot data, the latter has a 4-digit number e.g. <tt>state0012.cdf.dat</tt>.
The code outputs timeseries data and snapshot data, the latter has a 4-digit number e.g. <tt>state0012.cdf.dat</tt>.


To see when in the run each state was saved,
To see when in the run each state was saved,
> grep state OUT | less  [OR]
> head -n 1 vel_spec* | less
I often monitor progress with
> tail vel_energy.dat
or
> gnuplot
> plot 'vel_energy.dat' w l


<pre>&gt; grep state OUT | less  [OR]
=== End the run ===
&gt; head -n 1 vel_spec* | less</pre>
I often monitor progress with <tt>tail vel_energy.dat</tt> or


<pre>&gt; gnuplot
Type
&gt; plot 'vel_energy.dat' w l</pre>
> rm RUNNING
Use <tt>rm RUNNING</tt> to end the job.
and press enter.  This signals to the job to terminate (cleanly). 
Wait a few seconds then press enter again. There should be a message like,
'[1]+ Done nohup...', to say that the job has ended.


=== Making a util ===
=== Make a util ===


The core code in <tt>program/</tt> rarely needs to be changed. Almost anything can be done by creating a utility instead.
The core code in <tt>program/</tt> rarely needs to be changed. Almost anything can be done by creating a utility instead.
Line 112: Line 115:
Really, only the last command is necessary, which creates the executable <tt>prim2matlab.out</tt>.
Really, only the last command is necessary, which creates the executable <tt>prim2matlab.out</tt>.
It is good practice, however, to do the previous commands to generate a [[main.info]] file to keep alongside the executable.
It is good practice, however, to do the previous commands to generate a [[main.info]] file to keep alongside the executable.
=== Visualise ===
TO DO

Revision as of 08:20, 8 September 2014

This page is currently under revision! [today:2014/09/08].

If you haven't already, read through the Getting_started page. Skip the Getting_started#Compiling_libraries section if someone has set up the libraries and Makefile for you.

The following assumes that the code has been downloaded (Main_Page#Download), and that libraries have been correctly installed (Getting_started#Compiling_libraries), so that the command 'make' does not exit with an error.

Where to start from - initial conditions and the main.info file

The best initial condition is usually the output state from another run, preferably one with similar parameter settings. Output state files are named state0000.cdf.dat, state0001.cdf.dat, state0002.cdf.dat, and so on. Any of these can be used as an initial condition. If the resolution parameters do not match, the state is automatically interpolated or truncated to the new resolution (the resolution selected at compile time).

Download the following file from the Database: File:Re2400a1.25.tgz. Then extract the contents:

> tar -xvvzf Re2400a1.25.tgz

This should produce a directory Re2400a1.25/ containing an output state file state0010.cdf.dat and a main.info file. The main.info file is a record of parameter settings that were used when compiling the executable that produced the state file.

Set your parameters

We will assume serial use (for parallel use see Getting_started#Typical_usage).

The number of cores is set in parallel.h. Ensure that the number beside _Np is 1:

> head parallel.h
  ...
  #define _Np 1
  ...

If not, edit with your favourite text editor, e.g.

> nano parallel.h   [OR]
> pico parallel.h   [OR]
> gedit parallel.h
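
The value of _Np can also be checked non-interactively before compiling. The following is a minimal sketch, assuming the definition sits on a single line of the form '#define _Np 1' as shown above. The name get_np is a hypothetical helper, not part of the code.

```shell
#!/bin/sh
# get_np (hypothetical helper): print the value defined for _Np in a
# header file. Assumes a single line of the form "#define _Np <integer>".
get_np() {
    header="${1:-parallel.h}"
    awk '$1 == "#define" && $2 == "_Np" { print $3; exit }' "$header"
}

# Warn if parallel.h is present but not set up for serial use.
if [ -f parallel.h ] && [ "$(get_np parallel.h)" != "1" ]; then
    echo "parallel.h: _Np is not 1; edit it before compiling for serial use" >&2
fi
```

Run it from the directory that contains parallel.h. If the header is missing, the check is silently skipped.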

In another terminal window, take a look at the main.info downloaded a moment ago

> less Re2400a1.25/main.info

In the previous terminal window, edit the parameters so that they are the same as in the given main.info file

> nano program/parameters.f90

You should ignore from 'i_KL' onwards.

Compile and set up a job directory

After setting the parameters, we need to create an executable that will run with the settings we've chosen. To compile the code with the current parameter settings

> make 
> make install

If an error is produced, go back to the top and check that libraries and Makefile are set up correctly. The second command creates the directory install/ and a new main.info file. Optionally, you could check for any differences between the new and given parameters

> diff install/main.info Re2400a1.25/main.info

We'll create a new job directory with an initial condition in there ready for the new run

> cp Re2400a1.25/state0010.cdf.dat install/state.cdf.in
> mkdir -p ~/runs/
> mv install ~/runs/job0001
> cd ~/runs/job0001
> ls -l
      main.info
      main.out
      state.cdf.in
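
When several runs accumulate under ~/runs/, it can help to pick the next free job number automatically instead of typing it by hand. This is only a sketch: next_job_dir is a hypothetical helper, and the jobNNNN pattern is assumed from the job0001 name used above.

```shell
#!/bin/sh
# next_job_dir (hypothetical helper): print the first unused path of the
# form <base>/jobNNNN, following the job0001 naming used in this tutorial.
next_job_dir() {
    base="${1:-$HOME/runs}"
    n=1
    # Step through job0001, job0002, ... until a name is not yet taken.
    while [ -e "$(printf '%s/job%04d' "$base" "$n")" ]; do
        n=$((n + 1))
    done
    printf '%s/job%04d\n' "$base" "$n"
}
```

With the function defined in the current shell, the move step becomes

> mv install "$(next_job_dir ~/runs)"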

Start the run

To start the run, type

> nohup ./main.out > OUT 2> OUT.err &

'&' puts the job in the background. 'nohup' lets you log out of the terminal window without sending a 'hangup' to the job; otherwise, a forced closure of the window could kill it. Output and errors normally sent to the terminal window are redirected to the files OUT and OUT.err.

A few seconds after starting the job, press enter again. If there is a message like

'[1]+ Done nohup...'

then it is likely that there was an error. In that case, try

> less OUT 
> less OUT.err

If there is a message about MPI libraries, and earlier you changed _Np from another value to 1, then you could try running instead with

> mpirun -np 1 ./main.out > OUT 2> OUT.err &

You might need to include a path to mpirun; search your Makefile for the mpirun command.
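
Instead of pressing enter and looking for the 'Done' message, the shell can be asked directly whether the background job is still alive. The sketch below uses 'kill -0', which sends no signal and only tests whether the process exists. The name is_alive is a hypothetical helper, and 'sleep 2' is a stand-in for main.out so that the example runs anywhere.

```shell
#!/bin/sh
# is_alive (hypothetical helper): succeed if a process with this PID exists.
# "kill -0" delivers no signal; it only performs the existence check.
is_alive() {
    kill -0 "$1" 2>/dev/null
}

# Stand-in for: nohup ./main.out > OUT 2> OUT.err &
sleep 2 &
pid=$!   # $! holds the PID of the most recent background job

if is_alive "$pid"; then
    echo "job $pid is still running"
else
    echo "job $pid has stopped; check OUT and OUT.err"
fi
wait "$pid"
```

For a real run, replace the 'sleep 2 &' line with the nohup command above and drop the final 'wait'.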

Monitor the run

> tail OUT

The code outputs timeseries data and snapshot data; the snapshot files carry a 4-digit number, e.g. state0012.cdf.dat.

To see when in the run each state was saved,

> grep state OUT | less   [OR]
> head -n 1 vel_spec* | less

I often monitor progress with

> tail vel_energy.dat

or

> gnuplot
> plot 'vel_energy.dat' w l
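
Since any saved state can serve as the initial condition for a later run, it is often useful to find the most recent snapshot in a job directory. A minimal sketch follows, assuming only the stateNNNN.cdf.dat naming described above. The name latest_state is a hypothetical helper.

```shell
#!/bin/sh
# latest_state (hypothetical helper): print the highest-numbered snapshot
# file in a directory, or return non-zero if there is none.
latest_state() {
    dir="${1:-.}"
    latest=""
    # Glob matches are sorted, so the last existing match has the highest number.
    for f in "$dir"/state[0-9][0-9][0-9][0-9].cdf.dat; do
        [ -e "$f" ] && latest="$f"
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}
```

For example, to seed a new job directory from the last snapshot of the current one (the target path here is hypothetical):

> cp "$(latest_state .)" ~/runs/job0002/state.cdf.in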

End the run

Type

> rm RUNNING

and press enter. This signals the job to terminate cleanly. Wait a few seconds, then press enter again. There should be a message like '[1]+ Done nohup...' to say that the job has ended.
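
Removing a file to stop a job is a common "sentinel file" pattern: the job checks periodically whether the file still exists and shuts down in an orderly way once it is gone. The sketch below illustrates the idea in shell only. It is not how main.out is implemented, and run_until_stopped is a hypothetical stand-in for a long-running job.

```shell
#!/bin/sh
# run_until_stopped (hypothetical stand-in for a long-running job):
# create a sentinel file, do one unit of "work" per pass, and exit
# cleanly once the sentinel file has been removed by the user.
run_until_stopped() {
    sentinel="${1:-RUNNING}"
    : > "$sentinel"                # create the sentinel at start-up
    steps=0
    while [ -e "$sentinel" ]; do   # keep going while the file exists
        steps=$((steps + 1))
        sleep 1                    # placeholder for one timestep of work
    done
    echo "stopped cleanly after $steps steps"
}
```

Because the check happens between units of work, the job is never interrupted mid-step, which is why this kind of termination is clean in contrast to killing the process.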

Make a util

The core code in program/ rarely needs to be changed. Almost anything can be done by creating a utility instead. There are many examples in utils/. Further information can be found on the Utilities page.

In Makefile, set UTIL = prim2matlab. In the utils/ directory there is a corresponding file prim2matlab.f90.

> make
> make install
> make util

Really, only the last command is necessary, which creates the executable prim2matlab.out. It is good practice, however, to do the previous commands to generate a main.info file to keep alongside the executable.

Visualise

TO DO