{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# HyP3 InSAR Job Submission and Download Tutorial\n", "\n", "HyP3 processes whole frame interferograms using GAMMA, while it processes burst data using ISCE2. The `job_type` for GAMMA and ISCE2 is \"INSAR_GAMMA\" and \"INSAR_ISCE_BURST\" respectively. We design classed `InSARMission` and `InSARBurstMission` to handle the GAMMA and ISCE2 jobs respectively. Those two classes are shared same API, so you can use them in the same way.\n", "\n", "This notebook demonstrates how to submit and download HyP3 \"INSAR_GAMMA\" jobs using the `DataDownloader` packages. You can replace `InSARMission` with `InSARBurstMission` to submit and download \"INSAR_ISCE_BURST\" jobs.\n", ":::\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prepare the data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We encourage you to get the granule names from [ASF Data Search Vertex](https://search.asf.alaska.edu/) and download the metadata in the `.geojson` format. This will allow you to use the `DataDownloader` to download the data.\n", "\n", "![Download GeoJSON](/_static/images/asf_download_geojson.png)\n", "\n", "load the metadata from the downloaded `*.geojson` files using `geopandas`" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import geopandas as gpd\n", "from data_downloader import services\n", "from data_downloader import downloader as dl" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "asf_file = \"/Volumes/Data/asf-datapool-results-2024-03-29_11-24-18.geojson\"\n", "df_asf = gpd.read_file(asf_file)\n", "df_asf = df_asf[df_asf.processingLevel == \"SLC\"] # only use SLC" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "create a pandas Series where the granule names are the values, and the dates of the granules are used as the index." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# dates format in df_asf are mixed, convert them to standard date format\n", "dates_str = (\n", " pd.to_datetime(df_asf.startTime, format=\"mixed\")\n", " .map(lambda x: x.strftime(\"%F\"))\n", " .values\n", ")\n", "dates = pd.to_datetime(dates_str)\n", "\n", "# create a series with dates as index and sceneName as values\n", "granules = pd.Series(df_asf.sceneName.values, index=dates)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To generate pairs to be submitted, you can use the `PairsFactory` class in the `FanInSAR` module." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# generate pairs using given dates\n", "pairs_factory = fis.PairsFactory(dates)\n", "pairs = pairs_factory.from_interval(max_interval=3, max_day=180)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Submit jobs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To begin, we need to initialize the `HyP3Service` and `InSARMission` classes. The `HyP3Service` class handles the login and retrieval of user information, while the `InSARMission` class is responsible for submitting the INSAR_GAMMA jobs.\n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "service = services.HyP3Service()\n", "mission = services.InSARMission(granules=granules, service=service)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since the `look_vectors`, `dem`, and `inc_map` data are almost the same for all interferograms, we can submit them only once and then ignore them for subsequent submissions. This approach will significantly save storage space " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# submit first job to get shared data (DEM, incidence angle map, etc.)\n", "args = {\n", " \"name\": \"RGV_2024_03_first\", # give a unique name to find this job later\n", " \"include_look_vectors\": True,\n", " \"include_dem\": True,\n", " \"include_inc_map\": True,\n", " \"looks\": \"10x2\",\n", " \"phase_filter_parameter\": 0.5,\n", "}\n", "mission.job_parameters = args\n", "mission.submit_jobs(pairs[:1], skip_existing=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# submit the rest of the jobs without shared data (save time and storage for downloading)\n", "args = {\n", " \"name\": \"RGV_2024_03\",\n", " \"looks\": \"10x2\",\n", " \"phase_filter_parameter\": 0.5,\n", "}\n", "mission.job_parameters = args\n", "mission.submit_jobs(pairs, skip_existing=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check the status of missions" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "HyP3Service(\n", " user_id=fanchengyan1995, \n", " remaining_credits=10, \n", " succeeded=976,\n", " failed=0,\n", " pending=557,\n", " running=2\n", ")" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "service.flush()\n", "service" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Submit remaining jobs using a new account" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since one account can only submit a limited number of jobs, we can use a new account to submit the remaining jobs. \n", "To do this, we need to retrieve the pairs that have already been submitted and then submit the remaining pairs using the new account.\n", "\n", ":::{note}\n", "Please note that submitted jobs may be failed and need to be resubmitted in the future, as pending jobs may fail.\n", ":::" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ " primary secondary\n", "0 2014-10-10 2014-11-03\n", "1 2014-10-10 2014-11-15\n", "2 2014-10-10 2014-11-27\n", "3 2014-10-10 2014-12-09\n", "4 2014-11-03 2014-11-15\n", ".. ... ...\n", "877 2020-06-28 2020-07-22\n", "878 2020-07-04 2020-07-10\n", "879 2020-07-04 2020-07-16\n", "880 2020-07-04 2020-07-22\n", "881 2020-07-04 2020-07-28\n", "\n", "[882 rows x 2 columns]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pairs_submitted = mission.jobs_to_pairs(\n", " mission.jobs_on_service.succeeded\n", " + mission.jobs_on_service.running\n", " + mission.jobs_on_service.pending\n", ")\n", "pairs_submitted" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ " primary secondary\n", "0 2015-09-11 2015-09-23\n", "1 2015-09-11 2015-10-05\n", "2 2015-09-11 2015-10-17\n", "3 2015-09-23 2015-10-05\n", "4 2015-09-23 2015-10-17\n", ".. ... ...\n", "618 2024-02-20 2024-03-15\n", "619 2024-02-20 2024-03-27\n", "620 2024-03-03 2024-03-15\n", "621 2024-03-03 2024-03-27\n", "622 2024-03-15 2024-03-27\n", "\n", "[623 rows x 2 columns]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pairs_remain = pairs - pairs_submitted\n", "pairs_remain" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To avoid conflicts with user information stored in the `.netrc` file, you can log in to a new account for the HyP3 Service using the prompt." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "HyP3Service(\n", " user_id=fanchengyan, \n", " remaining_credits=10000, \n", " succeeded=0,\n", " failed=0,\n", " pending=0,\n", " running=0\n", ")" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "service1 = services.HyP3Service(prompt=True)\n", "service1" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "cfaadec8e9974e339de91886380c7e98", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Submitting jobs: 0%| | 0/623 [00:00