Commit 9531792c authored by Lionel GUEZ

Polish

parent 7f5c905c
...@@ -2,6 +2,8 @@ ...@@ -2,6 +2,8 @@
\usepackage[utf8]{inputenc} \usepackage[utf8]{inputenc}
\usepackage{amsmath}
\usepackage[T1]{fontenc} \usepackage[T1]{fontenc}
\usepackage{lmodern} \usepackage{lmodern}
...@@ -37,7 +39,7 @@ matlab -nojvm -r overlap ...@@ -37,7 +39,7 @@ matlab -nojvm -r overlap
Over the whole Eurec4A domain (all dates, that is 117 dates), in Over the whole Eurec4A domain (all dates, that is 117 dates), in
total for both orientations. \verb+inst_eddies_v6.py+ takes total for both orientations. \verb+inst_eddies_v6.py+ takes
about 11 min and produces 3 MiB, for 2951 instantaneous about 12 min and produces 3 MiB, for 2951 instantaneous
eddies. \verb+overlap.m+ takes 0 min. \verb+overlap_v6.py+ takes 0 eddies. \verb+overlap.m+ takes 0 min. \verb+overlap_v6.py+ takes 0
min and produces 48 KiB, for 2793 edges. \verb+survival.m+ takes 0 min. min and produces 48 KiB, for 2793 edges. \verb+survival.m+ takes 0 min.
\verb+survival.py+ takes 0 min and produces 64 KiB for 2863 nodes and \verb+survival.py+ takes 0 min and produces 64 KiB for 2863 nodes and
...@@ -56,6 +58,70 @@ intermediate storage for the v6 files. The file produced by ...@@ -56,6 +58,70 @@ intermediate storage for the v6 files. The file produced by
\verb+inst_eddies.m+ takes about a quarter of the space of the original \verb+inst_eddies.m+ takes about a quarter of the space of the original
file. file.
\section{Identification of instantaneous eddies}
\label{sec:identification}
Instantaneous eddies with a given orientation are identified in two
equivalent ways: either by a pair consisting of a date $d$ and an eddy
index $e$ at that date, or by a unique identifier $n$, which we can
call a \og node index\fg{}.
For a given date, let $d$ be the corresponding number of days since January
\nth{1}, 1950. Let $d_1$ be the value of $d$ for the first date of the
dataset. We call date index the integer:
\begin{equation*}
k = d - d_1
\end{equation*}
So the first value of $k$ is 0.
Let $e_{\mathrm{max}, o}(k)$ be the number of instantaneous eddies
with orientation $o$ at date index $k$. $e_{\mathrm{max}, o}(k)$ is
stored in the Matlab variables Nanti and Ncyclo. We assume we know an
upper bound $E$ of $e_{\mathrm{max}, o}(k)$ for all $k$ and both
orientations. That is, we guarantee that:
\begin{equation*}
\forall (k, o), e_{\mathrm{max}, o}(k) \le E
\end{equation*}
The eddy indices $e$ for eddies with orientation $o$ at date index $k$
start at 1 and increase by 1, without any gap. So they go from 1
to $e_{\mathrm{max}, o}(k)$.
The relation between node index $n$, date index $k$ and eddy index $e$
is:
\begin{equation*}
n = k E + e
\end{equation*}
So $n \ge 1$, and $n$ usually jumps at each change of date, because
$e_{\mathrm{max}, o}(k)$ is usually less than $E$. The
anticyclones at the first date have a node index between 1 and $E$, at
the second date between $E + 1$ and $2 E$, and so on. The same holds for
cyclones. Cf. table \ref{tab:eddy_id_Matlab}.
\begin{table}[htbp]
\centering
\begin{tabular}{lll}
date index 0 & & \\
& anticyclones & $1, \dots, e_\mathrm{max,anti}(0)$ \\
& cyclones & $1, \dots, e_\mathrm{max,cyclo}(0)$ \\
date index k & & \\
& anticyclones & $k E + 1, \dots, k
E + e_\mathrm{max,anti}(k)$ \\
& cyclones & $k E + 1, \dots, k E
+ e_\mathrm{max,cyclo}(k)$
\end{tabular}
\caption{Node indices in the Matlab program. $k$ is the date index,
starting at 0.}
\label{tab:eddy_id_Matlab}
\end{table}
Conversely, since $1 \le e \le E$ by definition of $E$, knowing $n$
and $E$, we can obtain $(k, e)$:
\begin{equation*}
k = \left \lfloor \frac{n - 1}{E} \right \rfloor
\end{equation*}
\begin{align*}
e & = n - k E \\
& = 1 + (n - 1) \bmod E
\end{align*}
\section{Instantaneous eddies} \section{Instantaneous eddies}
The data for instantaneous eddies is stored in shapefiles, in the The data for instantaneous eddies is stored in shapefiles, in the
...@@ -70,19 +136,21 @@ directories \verb+SHPC_(anti|cyclo)+. Cf. figure \ref{fig:convert_Matlab}. ...@@ -70,19 +136,21 @@ directories \verb+SHPC_(anti|cyclo)+. Cf. figure \ref{fig:convert_Matlab}.
\label{fig:convert_Matlab} \label{fig:convert_Matlab}
\end{figure} \end{figure}
The directory \verb+SHPC_(anti|cyclo)+ contains a set of four The directory \verb+SHPC_(anti|cyclo)+ contains a set of four
shapefiles: center, extremum, \verb+max_speed_contour+ and shapefiles: centroid, extremum, \verb+max_speed_contour+ and
\verb+outermost_contour+. The four shapefiles correspond to four \og \verb+outermost_contour+. The four shapefiles correspond to four \og
layers\fg{} of eddies. (\og layers\fg{} is a term that you can often layers\fg{} of eddies. (\og layers\fg{} is a term that you can often
find in the documentation of software dealing with geographical data.) find in the documentation of software dealing with geographical data.)
Each layer corresponds to a given type of geometry. Here we have only Each layer corresponds to a given type of geometry. Here we have only
two types of geometry: points and polygons. The layers center and two types of geometry: points and polygons. The layers centroid and
extremum contain points, while the layers \verb+max_speed_contour+ and extremum contain points, while the layers \verb+max_speed_contour+ and
\verb+outermost_contour+ contain polygons. The center layer is for the \verb+outermost_contour+ contain polygons. The centroid layer is for
geometric center of the maximum-speed contour, which is called the geometric center of the maximum-speed contour. The extremum layer
centroid in the Matlab files. The extremum layer is for the position is for the position of the extremum of SSH, which is called center in
of the extremum of SSH, which is called center in the Matlab the Matlab files.
files. Each eddy has a record, at the same subscript position, in the
four layers. Cf. figure \ref{fig:SHPC}. A shapefile contains \og shapes\fg{}. The shapes are identified by a
shape index $i$, starting at 0. Each eddy has a shape, at the same
shape index $i$, in the four layers. Cf. figure \ref{fig:SHPC}.
\begin{figure}[htbp] \begin{figure}[htbp]
\centering \centering
\includegraphics[width=\textwidth]{SHPC} \includegraphics[width=\textwidth]{SHPC}
...@@ -96,12 +164,40 @@ ending with suffixes \verb+.shp+, \verb+.dbf+ and \verb+.shx+. The ...@@ -96,12 +164,40 @@ ending with suffixes \verb+.shp+, \verb+.dbf+ and \verb+.shx+. The
\verb+.shp+ file contains the positions, the \verb+.dbf+ file contains \verb+.shp+ file contains the positions, the \verb+.dbf+ file contains
the metadata, and the \verb+.shx+ file is an index. So the \verb+.shp+ the metadata, and the \verb+.shx+ file is an index. So the \verb+.shp+
file is the largest file of the three. But the three files form a file is the largest file of the three. But the three files form a
logical unit and you should never separate them. There is also a file logical unit and you should never separate them.
\verb+ishape_last.txt+ in the directory \verb+SHPC_(anti|cyclo)+ which
gives the last subscript in the shapefiles for each date. This file is The instantaneous eddies are identified, for a given orientation, by a
used to access directly any instantaneous eddy at any date. Finally, date and an eddy index at that date. See §
there is a file, \verb+grid_nml.txt+, which gives (in Fortran namelist \ref{sec:identification}. These are fields (column headers) in the dbf
format) the grid of SSH data from which the eddies were detected. files. The date is given as the number of days since January \nth{1},
1950, in the field \verb+days_1950+. The eddy index is between 1 and
the number of eddies at the date, in the field \verb+eddy_index+.
The file \verb+ishape_last.txt+ in the directory
\verb+SHPC_(anti|cyclo)+ is used to access directly any instantaneous
eddy at any date. Let $l_o(k)$ be the shape index in the shapefiles,
for a given orientation $o$, of the last instantaneous eddy at date
index $k$:
\begin{equation*}
  l_o(k) = \left[ \sum_{k' = 0} ^k e_{\mathrm{max}, o}(k') \right] - 1
\end{equation*}
So the shape index $i$ of the eddy with eddy index $e$ at date index $k$ is:
\begin{equation*}
i =
\begin{array}{|ll}
e - 1 & \mathrm{if}\ k = 0 \\
l_o(k - 1) + e & \mathrm{if}\ k \ge 1
\end{array}
\end{equation*}
Also, note that, if $k \ge 1$ then:
\begin{equation*}
  e_{\mathrm{max}, o}(k) = l_o(k) - l_o(k - 1)
\end{equation*}
\verb+ishape_last.txt+ gives $l_o(k)$ for all $k$.
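As an illustration of the formulas above, here is a minimal Python sketch that computes the shape index $i$ from the values $l_o(k)$. The function names and the sample counts are hypothetical. The sketch takes the values $l_o(k)$ as a list already in memory, because the exact layout of \verb+ishape_last.txt+ is not described here.

```python
from itertools import accumulate


def last_shape_indices(e_max):
    """Return the list of l(k): cumulative number of eddies minus 1.

    e_max[k] is the number of eddies at date index k.
    """
    return [total - 1 for total in accumulate(e_max)]


def shape_index(ishape_last, k, e):
    """Return the 0-based shape index of eddy e (1-based) at date index k.

    ishape_last[k] is l(k), the shape index of the last eddy at date
    index k.
    """
    if k == 0:
        n_eddies = ishape_last[0] + 1
    else:
        n_eddies = ishape_last[k] - ishape_last[k - 1]
    if not 1 <= e <= n_eddies:
        raise ValueError("eddy index out of range for this date")
    return e - 1 if k == 0 else ishape_last[k - 1] + e


# Hypothetical numbers of eddies at date indices 0, 1 and 2
ishape_last = last_shape_indices([3, 2, 4])
print(ishape_last)                     # [2, 4, 8]
print(shape_index(ishape_last, 0, 1))  # 0
print(shape_index(ishape_last, 1, 1))  # 3
print(shape_index(ishape_last, 2, 4))  # 8
```

With these sample counts, the last eddy of the last date has shape index 8, which is the total number of shapes minus 1, as expected.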
Finally, the file \verb+grid_nml.txt+ in the directory
\verb+SHPC_(anti|cyclo)+ gives (in Fortran namelist format) the grid
of SSH data from which the eddies were detected.
The shapefiles are in binary format, so you need special software to The shapefiles are in binary format, so you need special software to
read them. There is actually a large number of programs to read read them. There is actually a large number of programs to read
...@@ -131,25 +227,19 @@ the edges have a direction: the direction is chronological. ...@@ -131,25 +227,19 @@ the edges have a direction: the direction is chronological.
The file \verb+edgelist_(anti|cyclo)+ stores, for a given orientation The file \verb+edgelist_(anti|cyclo)+ stores, for a given orientation
of eddies, the whole graph as a list of edges. (This is a common graph of eddies, the whole graph as a list of edges. (This is a common graph
storage format.) Each line stores one edge: the origin node of the storage format.) Each line stores one edge: the origin node of the
edge followed by the target node of the edge. One edge is defined by a edge followed by the target node of the edge. A node is identified by
couple of integers: the date index and the eddy index at that a node index. Knowledge of the node index is equivalent to knowledge
date. The date index is the number of days since January \nth{1}, of the date and the eddy index. See § \ref{sec:identification}.
1950. The eddy index is between 1 and the number of eddies at the
date.
\section{Survival} \section{Survival}
We call survival the data that identifies several instantaneous The last part of the Matlab program has the task of recognizing
eddies as the same evolving physical object. The result of this analysis several instantaneous eddies as the same evolving physical object. We
is a dictionary of trajectories, stored in a file call this part of the analysis the survival part. The result of this
analysis is a dictionary of trajectories, stored in a file
\verb+traj_(anti|cyclo).json+. The key for each trajectory is the \verb+traj_(anti|cyclo).json+. The key for each trajectory is the
identifying number of the trajectory and the value is the identifying number of the trajectory and the value is the
corresponding list of instantaneous eddies. Here, an instantaneous corresponding list of instantaneous eddies. Here, an instantaneous
eddy is not identified by the couple date index and eddy eddy is identified by its node index.
index. Rather, it is identified by an equivalent identifying number
that is constructed from the date index and the eddy index. We could
call this number the \og node index\fg{}, since it identifies a node
in the abstract graph of overlapping. The relation between node index,
$n$, date index, $d$
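As a sketch of how the dictionary of trajectories could be used, the following Python code decodes node indices back into pairs (date index, eddy index). The sample JSON string, the value of $E$ and the function name \verb+decode_trajectories+ are hypothetical. Only the general structure, a trajectory identifier as key and a list of node indices as value, comes from the description above.

```python
import json


def decode_trajectories(traj_json, E):
    """Map each trajectory identifier to a list of (k, e) pairs.

    traj_json is a JSON text whose keys are trajectory identifiers and
    whose values are lists of node indices. E is the assumed upper
    bound on the number of eddies per date. JSON object keys are
    always strings, so they are converted to integers here.
    """
    trajectories = json.loads(traj_json)
    return {
        int(key): [((n - 1) // E, 1 + (n - 1) % E) for n in nodes]
        for key, nodes in trajectories.items()
    }


# Hypothetical content, standing in for a traj_(anti|cyclo).json file
sample = '{"1": [1, 101, 203], "2": [2, 102]}'
decoded = decode_trajectories(sample, E=100)
print(decoded[1])  # [(0, 1), (1, 1), (2, 3)]
print(decoded[2])  # [(0, 2), (1, 2)]
```

In this sample, trajectory 1 follows an eddy over three consecutive dates, and its eddy index changes from 1 to 3 at the third date.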
\end{document} \end{document}
...@@ -4,7 +4,8 @@ ...@@ -4,7 +4,8 @@
SHPC_anti and SHPC_cyclo. The mat files are assumed to be in v7.3 SHPC_anti and SHPC_cyclo. The mat files are assumed to be in v7.3
(HDF5) format and will be converted to v6 format before being read in (HDF5) format and will be converted to v6 format before being read in
Python. The data in each input file is the set of detected Python. The data in each input file is the set of detected
instantaneous eddies at a given date. instantaneous eddies at a given date. The second, optional argument,
is the final date processed.
""" """
......