Commit 9531792c authored by Lionel GUEZ
Polish

parent 7f5c905c
......@@ -2,6 +2,8 @@
\usepackage[utf8]{inputenc}
\usepackage{amsmath}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
......@@ -37,7 +39,7 @@ matlab -nojvm -r overlap
Over the whole Eurec4A domain (all dates, that is, 117 dates), in
total for both orientations. \verb+inst_eddies_v6.py+ takes
about 11 min and produces 3 MiB, for 2951 instantaneous
about 12 min and produces 3 MiB, for 2951 instantaneous
eddies. \verb+overlap.m+ takes 0 min. \verb+overlap_v6.py+ takes 0
min and produces 48 KiB, for 2793 edges. \verb+survival.m+ takes 0 min.
\verb+survival.py+ takes 0 min and produces 64 KiB for 2863 nodes and
......@@ -56,6 +58,70 @@ intermediate storage for the v6 files. The file produced by
\verb+inst_eddies.m+ takes up about a quarter of the space of the
original file.
\section{Identification of instantaneous eddies}
\label{sec:identification}
Instantaneous eddies with a given orientation are identified in two
equivalent ways: either by a pair consisting of a date $d$ and an eddy
index $e$ at that date, or by a unique identifier $n$, which we call
the \og node index\fg{}.
For a given date, let $d$ be the corresponding number of days since January
\nth{1}, 1950. Let $d_1$ be the value of $d$ for the first date of the
dataset. We define the date index as the integer:
\begin{equation*}
k = d - d_1
\end{equation*}
So the first value of $k$ is 0.
Let $e_{\mathrm{max}, o}(k)$ be the number of instantaneous eddies
with orientation $o$ at date index $k$. $e_{\mathrm{max}, o}(k)$ is
stored in the Matlab variables Nanti and Ncyclo. We assume we know an
overestimate $E$ of $e_{\mathrm{max}, o}(k)$ for all $k$ and both
orientations. That is, we guarantee that:
\begin{equation*}
\forall (k, o), \quad e_{\mathrm{max}, o}(k) \le E
\end{equation*}
The eddy indices $e$ for eddies with orientation $o$ at date index $k$
start at 1 and increase in steps of 1, without gaps. So they run from 1
to $e_{\mathrm{max}, o}(k)$.
The relation between node index $n$, date index $k$ and eddy index $e$
is:
\begin{equation*}
n = k E + e
\end{equation*}
So $n \ge 1$. Since $E$ is only an overestimate of the number of
eddies, $n$ usually jumps at each change of date. The
anticyclones at the first date have a node index between 1 and $E$, at
the second date between $E + 1$ and $2 E$, and so on. The same holds for
cyclones. Cf. table \ref{tab:eddy_id_Matlab}.
\begin{table}[htbp]
\centering
\begin{tabular}{lll}
date index 0 & & \\
& anticyclones & $1, \dots, e_\mathrm{max,anti}(0)$ \\
& cyclones & $1, \dots, e_\mathrm{max,cyclo}(0)$ \\
date index $k$ & & \\
& anticyclones & $k E + 1, \dots, k
E + e_\mathrm{max,anti}(k)$ \\
& cyclones & $k E + 1, \dots, k E
+ e_\mathrm{max,cyclo}(k)$
\end{tabular}
\caption{Node indices in the Matlab program. $k$ is the date index,
starting at 0.}
\label{tab:eddy_id_Matlab}
\end{table}
Conversely, since $1 \le e \le E$ by definition of $E$, we can recover
$(k, e)$ from $n$ and $E$:
\begin{equation*}
k = \left \lfloor \frac{n - 1}{E} \right \rfloor
\end{equation*}
\begin{align*}
e & = n - k E \\
& = 1 + (n - 1) \bmod E
\end{align*}
\section{Instantaneous eddies}
The data for instantaneous eddies is stored in shapefiles, in the
......@@ -70,19 +136,21 @@ directories \verb+SHPC_(anti|cyclo)+. Cf. figure \ref{fig:convert_Matlab}.
\label{fig:convert_Matlab}
\end{figure}
The directory \verb+SHPC_(anti|cyclo)+ contains a set of four
shapefiles: center, extremum, \verb+max_speed_contour+ and
shapefiles: centroid, extremum, \verb+max_speed_contour+ and
\verb+outermost_contour+. The four shapefiles correspond to four \og
layers\fg{} of eddies. (\og layers\fg{} is a term that you can often
find in the documentation of software dealing with geographical data.)
Each layer corresponds to a given type of geometry. Here we have only
two types of geometry: points and polygons. The layers center and
two types of geometry: points and polygons. The layers centroid and
extremum contain points, while the layers \verb+max_speed_contour+ and
\verb+outermost_contour+ contain polygons. The center layer is for the
geometric center of the maximum-speed contour, which is called
centroid in the Matlab files. The extremum layer is for the position
of the extremum of SSH, which is called center in the Matlab
files. Each eddy has a record, at the same subscript position, in the
four layers. Cf. figure \ref{fig:SHPC}.
\verb+outermost_contour+ contain polygons. The centroid layer is for
the geometric center of the maximum-speed contour. The extremum layer
is for the position of the extremum of SSH, which is called center in
the Matlab files.
A shapefile contains \og shapes\fg{}. The shapes are identified by a
shape index $i$, starting at 0. Each eddy has a shape, at the same
shape index $i$, in the four layers. Cf. figure \ref{fig:SHPC}.
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{SHPC}
......@@ -96,12 +164,40 @@ ending with suffixes \verb+.shp+, \verb+.dbf+ and \verb+.shx+. The
\verb+.shp+ file contains the positions, the \verb+.dbf+ file contains
the metadata, and the \verb+.shx+ file is an index. So the \verb+.shp+
file is the largest file of the three. But the three files form a
logical unit and you should never separate them. There is also a file
\verb+ishape_last.txt+ in the directory \verb+SHPC_(anti|cyclo)+ which
gives the last subscript in the shapefiles for each date. This file is
used to access directly any instantaneous eddy at any date. Finally,
there is a file, \verb+grid_nml.txt+, which gives (in Fortran namelist
format) the grid of SSH data from which the eddies were detected.
logical unit and you should never separate them.
The instantaneous eddies are identified, for a given orientation, by a
date and an eddy index at that date. See §
\ref{sec:identification}. These are fields (column headers) in the dbf
files. The date is given as the number of days since January \nth{1},
1950, in the field \verb+days_1950+. The eddy index is between 1 and
the number of eddies at the date, in the field \verb+eddy_index+.
The file \verb+ishape_last.txt+ in the directory
\verb+SHPC_(anti|cyclo)+ is used to access directly any instantaneous
eddy at any date. Let $l_o(k)$ be the shape index in the shapefiles,
for a given orientation $o$, of the last instantaneous eddy at date
index $k$:
\begin{equation*}
  l_o(k) = \left( \sum_{k' = 0}^k e_{\mathrm{max}, o}(k') \right) - 1
\end{equation*}
So the shape index $i$ of the eddy with eddy index $e$ at date index $k$ is:
\begin{equation*}
  i =
  \begin{cases}
    e - 1 & \text{if } k = 0 \\
    l_o(k - 1) + e & \text{if } k \ge 1
  \end{cases}
\end{equation*}
Also, note that, if $k \ge 1$ then:
\begin{equation*}
  e_{\mathrm{max}, o}(k) = l_o(k) - l_o(k - 1)
\end{equation*}
\verb+ishape_last.txt+ gives $l_o(k)$ for all $k$.
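As an illustration of the formulas above, here is a minimal Python
sketch. It assumes that the values $l_o(k)$ have already been read from
\verb+ishape_last.txt+ into a list \verb+ishape_last+, with
\verb+ishape_last[k]+ equal to $l_o(k)$. The layout of the file itself
is not described here, so reading it is left out. The function names
are hypothetical and the numbers in the usage lines are made up.

```python
def number_of_eddies(k: int, ishape_last: list[int]) -> int:
    """Return e_max(k), the number of eddies at date index k."""
    # For k = 0 there is no previous date: l(0) = e_max(0) - 1.
    previous_last = ishape_last[k - 1] if k >= 1 else -1
    return ishape_last[k] - previous_last


def shape_index(k: int, e: int, ishape_last: list[int]) -> int:
    """Return the shape index i of eddy index e at date index k."""
    if not 1 <= e <= number_of_eddies(k, ishape_last):
        raise ValueError("eddy index out of range for this date")
    if k == 0:
        return e - 1
    return ishape_last[k - 1] + e


# Usage with made-up counts: 3, 2 and 4 eddies at date indices 0, 1, 2,
# so that l = [2, 4, 8].
ishape_last = [2, 4, 8]
first_of_date_1 = shape_index(1, 1, ishape_last)   # 3
last_of_date_2 = shape_index(2, 4, ishape_last)    # 8, that is l(2)
```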
Finally, the file \verb+grid_nml.txt+ in the directory
\verb+SHPC_(anti|cyclo)+ gives (in Fortran namelist format) the grid
of SSH data from which the eddies were detected.
The shapefiles are in binary format, so you need special software to
read them. There is actually a large number of programs to read
......@@ -131,25 +227,19 @@ the edges have a direction: the direction is chronological.
The file \verb+edgelist_(anti|cyclo)+ stores, for a given orientation
of eddies, the whole graph as a list of edges. (This is a common graph
storage format.) Each line stores one edge: the origin node of the
edge followed by the target node of the edge. One edge is defined by a
couple of integers: the date index and the eddy index at that
date. The date index is the number of days since January \nth{1},
1950. The eddy index is between 1 and the number of eddies at the
date.
edge followed by the target node of the edge. A node is identified by
a node index. Knowledge of the node index is equivalent to knowledge
of the date and the eddy index. See § \ref{sec:identification}.
\section{Survival}
We call survival the data that identifies several instantaneous
eddies as a same evolving physical object. The result of this analysis
is a dictionary of trajectories, stored in a file
The last part of the Matlab program has the task of recognizing
several instantaneous eddies as the same evolving physical object. We
call this part of the analysis the survival part. The result of this
analysis is a dictionary of trajectories, stored in a file
\verb+traj_(anti|cyclo).json+. The key for each trajectory is the
identifying number of the trajectory and the value is the
corresponding list of instantaneous eddies. Here, an instantaneous
eddy is not identified by the couple date index and eddy
index. Rather, it is identified by an equivalent identifying number
that is constructed from the date index and the eddy index. We could
call this number the \og node index\fg{}, since it identifies a node
in the abstract graph of overlapping. The relation between node index,
$n$, date index, $d$
eddy is identified by its node index.
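A possible way to read such a dictionary in Python is sketched
below. This is an illustration under assumptions, not the project's
own code: the function name \verb+decode_trajectories+ is hypothetical,
each value is assumed to be a flat list of integer node indices, the
sample JSON text is made up, and the value of $E$ must be the one used
when the node indices were computed. Note that JSON object keys are
always strings, so the trajectory numbers are converted to integers.

```python
import json


def decode_trajectories(
    text: str, E: int
) -> dict[int, list[tuple[int, int]]]:
    """Map each trajectory number to its list of (date index, eddy index)."""
    raw = json.loads(text)
    trajectories = {}
    for key, nodes in raw.items():
        decoded = []
        for n in nodes:
            # n = k E + e with 1 <= e <= E
            k, remainder = divmod(n - 1, E)
            decoded.append((k, remainder + 1))
        trajectories[int(key)] = decoded
    return trajectories


# Usage with made-up content and an arbitrary E = 1000:
sample = '{"1": [1, 1002], "2": [3]}'
trajectories = decode_trajectories(sample, E=1000)
# trajectories == {1: [(0, 1), (1, 2)], 2: [(0, 3)]}
```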
\end{document}
......@@ -4,7 +4,8 @@
SHPC_anti and SHPC_cyclo. The mat files are assumed to be in v7.3
(HDF5) format and will be converted to v6 format before being read in
Python. The data in each input file is the set of detected
instantaneous eddies at a given date.
instantaneous eddies at a given date. The second, optional argument
is the final date processed.
"""
......