modify fabric to calculate emissions from REST api #501

Open
sichen1234 opened this issue Mar 18, 2022 · 23 comments

Comments

@sichen1234
Contributor

Modify Fabric to get emissions factors from the Postgres database in data/postgres/ instead of directly from Fabric's CouchDB, possibly by specifying an IP address/port connection to Postgres, or by using a Docker container if really necessary.

@sichen1234
Contributor Author

After #500, once the data is loaded into Postgres, modify the chaincode that accesses this seed data, for example in https://github.com/hyperledger-labs/blockchain-carbon-accounting/blob/main/emissions-data/chaincode/emissionscontract/typescript/src/lib/emissionsFactor.ts, to read it directly from the Postgres database.

As a test, try accessing any Postgres database from chaincode and see if it works.

@brioux
Member

brioux commented Jun 15, 2022

@sichen1234 getting caught up on this issue as it relates to the data integration mentorship assigned to @Ackintya.
This blog discusses support for other state DBs and highlights that Fabric only supports CouchDB or LevelDB natively. The author mentions a working document on using Go plugins for pluggable ledger state databases.
Is this what you have in mind? If so, we need to put together a plan, as the solution requires forking Fabric (they recommend just using CouchDB or LevelDB).

@sichen1234
Contributor Author

sichen1234 commented Jun 15, 2022 via email

@brioux
Member

brioux commented Jun 15, 2022

OK, I understand the issue better now.
The emissions will not be stored in the Fabric state DB (i.e., CouchDB); instead, the chaincode calls an external service (i.e., a PostgreSQL DB). This makes sense, as it avoids loading entire datasets into the Fabric state DB.

@brioux
Member

brioux commented Jun 15, 2022

In this case I think a good first task for @Ackintya is to work on revising the chaincode to pull data from an external resource.

He can start by working on modifying the emissions chaincode to connect to the external postgreSQL database where all the emissions data is now being loaded.

This directory will replace the data loaded into fabric couchDB using egrid-data-loader.

@brioux
Member

brioux commented Jun 15, 2022

@Ackintya here is an example of where the emissions chaincode will need to be modified.
The function getEmissionsRecord of the EmissionsRecordState should query external Postgres DB for the uuid of the emission record, instead of the internal stateDB (couch).

@brioux
Member

brioux commented Jun 19, 2022

I have been thinking about this issue over the weekend. There are two approaches to using an external database to access emission records.

  1. New approach: the chaincode queries records through an API call. Each Fabric peer has to submit the query itself, and all peers expect the API to return the same result.
  2. Old approach: an organization queries the external database (e.g., Postgres) before calling the chaincode. The organization then requests peers to store the record in the Fabric internal state DB (e.g., with the importUtilityFactor chaincode function). Peers handle audit requests by querying the shared state DB, with no API calls.

@brioux
Member

brioux commented Jun 19, 2022

@Ackintya, @sichen1234 is asking to implement 1. The organization tells the network where to get the data from, or the API address/functions are hard coded into the chaincode.

I read this thread, which warns that making external API calls inside Fabric could result in consensus issues, i.e., if peers receive different results.

There is still Fabric documentation on how to do this here, but it may not be updated for recent versions. If we take this route, we need to research this further.

This can be an issue if running a network with a large number of peers and access to the external service is unstable -> peers don't receive the same result. With only a few organizations/peers on the audit channel, and stable connection to the external DB, this should not be a major issue.
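To make the consensus concern concrete, here is a minimal sketch (not Fabric code) of why differing API responses are a problem: each endorsing peer runs the chaincode independently, and the transaction can only go through if their results match. The `PeerResult` type, the `resultsAgree` helper, and the sample values are all hypothetical and only for illustration.

```typescript
// Hypothetical stand-in for what one endorsing peer computed
// after calling the external emissions API.
interface PeerResult {
  peer: string;
  emissionsValue: number;
  uom: string;
}

// Returns true only if every peer produced exactly the same value and unit.
// An empty list is treated as "no agreement".
function resultsAgree(results: PeerResult[]): boolean {
  if (results.length === 0) return false;
  const first = results[0];
  return results.every(
    (r) => r.emissionsValue === first.emissionsValue && r.uom === first.uom
  );
}

// Made-up sample data: a stable API returns the same answer to both peers.
const stableResults: PeerResult[] = [
  { peer: "peer0.org1", emissionsValue: 100, uom: "kg" },
  { peer: "peer0.org2", emissionsValue: 100, uom: "kg" },
];

// Made-up sample data: the API changed its answer between the two calls.
const unstableResults: PeerResult[] = [
  { peer: "peer0.org1", emissionsValue: 100, uom: "kg" },
  { peer: "peer0.org2", emissionsValue: 101, uom: "kg" },
];

console.log(resultsAgree(stableResults));
console.log(resultsAgree(unstableResults));
```

With only a few peers and a stable external DB, the first case should be the norm, but any nondeterminism in the API response (timestamps, rounding, updated factors) would put us in the second case.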

If we stick to the old approach, my understanding is that the only difference from how the Fabric network is currently set up is that the emission records are no longer all written directly to the state DB (e.g., using the egrid-data.load.sh script). Each organization can set up its own connection to an external emission database (e.g., the postgres data-loader). Only records submitted for audit are written to the state DB (so peers do not have to query the external service).

@brioux
Member

brioux commented Jun 19, 2022

Benefits of the new approach (assuming consensus is not an issue).

  1. The chaincode can be configured to whitelist recognized/trusted emission record APIs
  2. No need for internal state DB to replicate existing emission record database

@sichen1234
Contributor Author

sichen1234 commented Jun 20, 2022 via email

@sichen1234
Contributor Author

These commits have code that shows how to access Amazon DynamoDB from chaincode:
74ec825 [74ec825]
b5ffcfe [b5ffcfe]
7c5c3fa [7c5c3fa]
c21a1c4 [c21a1c4]

This could be a good example for accessing external data sources from chain code, even if we're using postgres instead of dynamodb.

@sichen1234 sichen1234 changed the title modify fabric to get emissions factors from postgres database modify fabric to get emissions factors from REST api Jun 23, 2022
@sichen1234
Contributor Author

Based on discussions this morning, it's better to create a REST API server that provides emissions calculations based on lib/emissions_data/src/lib/emissions-calc.ts. The Fabric chaincode will then call this REST API to get the emissions and record them on the Fabric network.

This will simulate working with an external oracle service or API service that calculates emissions but does not provide the emissions factors. It will also reuse the code in lib/emissions_data/src, which is used by other apps like supply-chain.

@brioux
Member

brioux commented Jul 5, 2022

@Ackintya
I am looking at your modifications to the emissionsRecordContract.ts

First, the rest API should expect from the Fabric chaincode any cmd + arguments combination to be relayed to the external DB server, and expect a CO2EmissionFactorInterface object as response. This is stored in the co2Emission variable in the Fabric chaincode.

The emissionsRecordContract.ts chaincode file will no longer access the following from the Fabric state DB:
getUtilityLookupItem
getEmissionsFactorByLookupItem
These calls should be dropped, this is all stored in the external DB.

There are different ways to get a CO2EmissionFactorInterface object

  1. Use existing postgres server cmd npm run pg:getData activity-emissions <scope> <level1> <level2> <level3> <level4> <text> <amount> [uom] that calls getCO2EmissionByActivity.
  2. Can also replicate how the chaincode currently gets emissions using utilityId: getUtilityLookupItem -> getEmissionsFactorByLookupItem -> getCO2EmissionFactor. However, this logic needs to be performed by the postgres DB server not Fabric. The methods have already been moved to data/src/repositories (see the links...)
    E.g., create a new pg command like npm run pg:getData utility-id-emissions <utilityId> <amount> [uom] that would getCO2EmissionByUtilityId.

In both cases the Fabric user has to tell the rest API to send the command and attributes to the server operating the external DB. This requires modifying inputs of recordEmissions.

@brioux
Member

brioux commented Jul 5, 2022

As an example we can use activity-emissions cmd to get data for uuid 3622b20d-1e94-4490-ba8b-6e0b73910e2 by sending the following args
scope: "SCOPE 2"
level1: "eGRID EMISSIONS FACTORS"
level2: "USA"
level3: "STATE: VA"

or use getUtilityLookupItem(uuid) directly.
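To illustrate how the args above could be passed along, here is a small sketch that turns a request object into the positional argument list of the activity-emissions command. The `ActivityEmissionsRequest` type and `buildActivityEmissionsArgs` helper are hypothetical, the amount and uom values are made up, and sending empty strings for level4 and text is an assumption about how the command treats unused positions.

```typescript
// Hypothetical request shape mirroring the positional arguments:
// <scope> <level1> <level2> <level3> <level4> <text> <amount> [uom]
interface ActivityEmissionsRequest {
  scope: string;
  level1: string;
  level2: string;
  level3: string;
  level4?: string;
  text?: string;
  amount: number;
  uom?: string;
}

// Builds the argument list in the order the command expects.
function buildActivityEmissionsArgs(req: ActivityEmissionsRequest): string[] {
  if (!isFinite(req.amount) || req.amount < 0) {
    throw new Error("amount must be a non-negative number");
  }
  const args = [
    "activity-emissions",
    req.scope,
    req.level1,
    req.level2,
    req.level3,
    req.level4 ?? "", // assumption: empty string for an unused level
    req.text ?? "", // assumption: empty string for unused text
    String(req.amount),
  ];
  if (req.uom !== undefined) args.push(req.uom);
  return args;
}

// The scope and levels come from the example above; amount and uom are made up.
const exampleArgs = buildActivityEmissionsArgs({
  scope: "SCOPE 2",
  level1: "eGRID EMISSIONS FACTORS",
  level2: "USA",
  level3: "STATE: VA",
  amount: 100,
  uom: "kWh",
});

console.log(exampleArgs);
```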

@sichen1234
Contributor Author

I definitely think the better way to do this is @brioux's option 1. The chaincode should call the API, which should call getCO2EmissionByActivity.

@sichen1234 sichen1234 changed the title modify fabric to get emissions factors from REST api modify fabric to calculate emissions from REST api Jul 13, 2022
@sichen1234
Contributor Author

@Ackintya Please look at the process_electricity method in lib/src/emissions-utils.js. It maps the utility fields into the emissions factors fields. You can call this method from the REST API directly and map the fields to its input, or follow its logic to call the utility item lookup and then map the output to the get emissions factors call. Since we're just getting electricity emissions in the Fabric chaincode, it might be better to call process_electricity directly.

What do you think, @brioux?

@brioux
Member

brioux commented Jul 14, 2022

@sichen1234 a clarification first: I think you mean to call lib/supply-chain/src/process_electricity...

@Ackintya, you can use your REST API (oracle) to call DB directly. Sorry I made a mistake, there is no need to use app/api-server.

You can use process_electricity as suggested above or any other function that uses the EmissionsFactorRepo, including the getCO2EmissionByActivity method we identified originally.

FYI - PostgresDBService.getInstance() establishes the connection to the Postgres DB.

Make sure your .env variables are configured to the values used by your postgres database if they are different from the default values set in data/src/config.ts

@brioux
Member

brioux commented Jul 14, 2022

@sichen1234
These two functions (process_electricity and getCO2EmissionByActivity) expect inputs (ElectricityActivity and ActivityInterface) that do not align with what Fabric chaincode requests from an organization, i.e. recordEmissions(utilityId .....

The chaincode requires the utilityId to get the emissions factor (it corresponds to the uuid in the pg table). We will need to update the chaincode inputs and higher-level functions (e.g., the swagger API) to accommodate different emission calculation requests (host and query/calc arguments).

@Ackintya to avoid having to change the Fabric chaincode FOR NOW, you can set up a new EmissionsFactorRepo method similar to getCO2EmissionByActivity that is called by the Oracle. It would require only the uuid to query the emission-factor table using getEmissionFactor = async (uuid: string), not the activity data.

@brioux
Member

brioux commented Jul 14, 2022

@Ackintya keep in mind irrespective of the source DB and method used, results should be converted into a general type validated by the Oracle based on the requirements of the Fabric chaincode. E.g., ActivityResult by process_electricity or CO2EmissionFactorInterface by getCO2EmissionByActivity.
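A minimal sketch of what that validation step in the Oracle could look like, assuming a simplified result shape. `OracleEmissionsResult` and `isOracleEmissionsResult` are hypothetical names, and the fields shown are an assumption; the real ActivityResult and CO2EmissionFactorInterface types may carry different or additional fields.

```typescript
// Hypothetical general result type the Oracle would hand back to the chaincode.
interface OracleEmissionsResult {
  emission: { value: number; uom: string };
}

// Type guard: checks an untyped response (e.g., parsed JSON) before it is
// passed on, regardless of which DB or method produced it.
function isOracleEmissionsResult(data: unknown): data is OracleEmissionsResult {
  if (typeof data !== "object" || data === null) return false;
  const emission = (data as { emission?: unknown }).emission;
  if (typeof emission !== "object" || emission === null) return false;
  const { value, uom } = emission as { value?: unknown; uom?: unknown };
  return (
    typeof value === "number" &&
    isFinite(value) &&
    typeof uom === "string" &&
    uom.length > 0
  );
}

// Made-up responses for illustration.
const goodResponse: unknown = JSON.parse('{"emission":{"value":12.5,"uom":"kg"}}');
const badResponse: unknown = JSON.parse('{"emission":{"value":"12.5"}}');

console.log(isOracleEmissionsResult(goodResponse));
console.log(isOracleEmissionsResult(badResponse));
```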

@sichen1234
Contributor Author

sichen1234 commented Jul 14, 2022 via email

@brioux
Member

brioux commented Jul 14, 2022

@Ackintya to avoid having to change the Fabric chaincode FOR NOW, you can set up a new function, e.g., getCO2emissionsByUtilityId(utilityId, thruDate, activity_uom, activity_amount)

It requires only the utilityId (the uuid in the utility_lookup_item table) and thruDate to query the emission factor, instead of activity data.

Replicate what the Fabric chaincode does:

  1. Use utilityLookup = db.getUtilityLookupItemRepo().getUtilityLookupItem(utilityId) method from data/src/repositories/utilityLookupItem.repo.ts to get a UtilityLookupItemInterface object.
  2. Then use
    emissionFactor = db.getEmissionsFactorRepo().getEmissionsFactorByLookupItem(utilityLookup,thruDate)
    method from data/src/repositories/utilityLookupItem.repo.ts.

Finally calculate the emissions for the activity to return to Fabric!
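The steps above could be sketched roughly as follows. This is not the real repository code: the interfaces are simplified, hypothetical stand-ins, the field names and sample numbers are made up, and the stand-in repositories are synchronous for brevity, whereas the real repo methods return promises. Unit conversion is not sketched; the function simply rejects a mismatched unit.

```typescript
// Hypothetical, simplified shapes; the real interfaces have more fields.
interface UtilityLookupItem {
  uuid: string;
  state_province: string;
}
interface EmissionsFactor {
  co2_per_unit: number; // made-up field: emissions per one activity unit
  emissions_uom: string;
  activity_uom: string;
}
interface EmissionsResult {
  value: number;
  uom: string;
}

// Synchronous stand-ins for the two repository calls named above.
interface Repos {
  getUtilityLookupItem(utilityId: string): UtilityLookupItem | undefined;
  getEmissionsFactorByLookupItem(
    item: UtilityLookupItem,
    thruDate: string
  ): EmissionsFactor | undefined;
}

function getCO2EmissionsByUtilityId(
  repos: Repos,
  utilityId: string,
  thruDate: string,
  activityUom: string,
  activityAmount: number
): EmissionsResult {
  // Step 1: resolve the utility lookup item from the utilityId.
  const lookup = repos.getUtilityLookupItem(utilityId);
  if (!lookup) throw new Error(`no utility lookup item for ${utilityId}`);

  // Step 2: find the emissions factor for that item and date.
  const factor = repos.getEmissionsFactorByLookupItem(lookup, thruDate);
  if (!factor) throw new Error(`no emissions factor for ${utilityId}`);

  // Unit conversion is out of scope here, so mismatched units are rejected.
  if (factor.activity_uom !== activityUom) {
    throw new Error(`expected ${factor.activity_uom}, got ${activityUom}`);
  }

  // Final step: emissions = activity amount * factor.
  return { value: activityAmount * factor.co2_per_unit, uom: factor.emissions_uom };
}

// In-memory demo data (entirely made up).
const demoRepos: Repos = {
  getUtilityLookupItem: (id) =>
    id === "utility-1" ? { uuid: id, state_province: "VA" } : undefined,
  getEmissionsFactorByLookupItem: (item) =>
    item.state_province === "VA"
      ? { co2_per_unit: 0.5, emissions_uom: "kg", activity_uom: "kWh" }
      : undefined,
};

const demoResult = getCO2EmissionsByUtilityId(demoRepos, "utility-1", "2021-12-31", "kWh", 200);
console.log(demoResult);
```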

@sichen1234
Contributor Author

sichen1234 commented Oct 11, 2022 via email

@brioux
Member

brioux commented Oct 11, 2022

The approach adopted in PR 616 was to introduce a new oracle api into the chaincode, rather than an explicit connection to the existing rest API.

The oracle can then be configured to relay connections to an approved rest-api with the required DB connection.

For now the DB connection was hardcoded into the Oracle API. The calls to the db repositories required by the fabric chaincode were not yet set up in the postgres rest-api. It needs to be updated.

I.e.,

query_response = await db.getEmissionsFactorRepo().getCO2EmissionFactorByLookup(lookup,usage,usageUOM,thruDate);

The chaincode and swagger-api tests were set up to get emissions by lookup item, and not the newer 'getEmissions' by activity added to the rest-api trpc routers.

I am looking for a candidate to:

  1. work on modifying the chaincode and oracle API to request emission records using the existing rest-api routers,
  2. extend the rest-api to handle the original emission requests set up within the Fabric chaincode.
