Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Edit a dataset #498

Closed
NakulManchanda opened this issue Jul 17, 2020 · 49 comments
Closed

Edit a dataset #498

NakulManchanda opened this issue Jul 17, 2020 · 49 comments
Assignees
Labels
app editor enhancement New feature or request zss This issue has some dependency on zss
Milestone

Comments

@NakulManchanda
Copy link
Member

NakulManchanda commented Jul 17, 2020

ZSS needs capability to create sequential and partitioned datasets.

ZSS DatasetService

Examples from zosmf:
https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.1.0/com.ibm.zos.v2r1.izua700/IZUHPINFO_API_CreateDataSet.htm#CreateDataSet

Create Sequential Dataset

POST /zss/datasetContents/JIAHJ.REST.TEST.NEWDS HTTP/1.1
Request body:

{"volser":"zmf046","unit":"3390","dsorg":"PS","alcunit":"TRK","primary":10,
"secondary":5,"avgblk":500,"recfm":"FB","blksize":400,"lrecl":80}

Create Partitioned Dataset:

POST /zss/datasetContents//JIAHJ.REST.TEST.NEWDS02 HTTP/1.1
Request Body

{"volser":"zmf046","unit":"3390","dsorg":"PO","alcunit":"TRK","primary":10,
"secondary":5,"dirblk":10,"avgblk":500,"recfm":"FB","blksize":400,"lrecl":80}

Properties for request payload:
Table 1. Request body to create a sequential and partitioned data set

Field	Type	Description
volser	String	Volume.
unit	String	Device type.
dsorg	String	Data set organization.
alcunit	String	Unit of space allocation.
primary	Integer	Primary space allocation.
secondary	Integer	Secondary space allocation.
dirblk	Integer	Number of directory blocks.
avgblk	Integer	Average block.
recfm	String	Record format.
blksize	Integer	Block size.
lrecl	Integer	Record length.
storeclass	String	Storage class.
mgntclass	String	Management class.
dataclass	String	Data class.

Result:

201 Created
Content-Type:  application/json; charset=UTF-8
Content-Length:  0
Date:  Wed, 16 Sep 2015 10:54:21 GMT 
@John-A-Davies
Copy link

To that list of properties we should add:

Data set name type -- string -- PDS, PDS-E or SEQ

@John-A-Davies
Copy link

John-A-Davies commented Jul 22, 2020

A note about locking. Datasets can be opened for sharing (DISP=SHR) and if you want exclusive access, you use (DISP=OLD). Locking at the member level is also possible, and ISPF EDIT does this when you open a member.

So a simple model would be to have an EDIT option when you open a dataset or member. The default would be BROWSE

image

@NakulManchanda @1000TurquoisePogs That's the way dataset access works on z/OS. Do you think this is acceptable, rather than deciding after a long edit session on a shared file,
to then try to resolve your updates with those of others that may have occurred in the meantime?

@1000TurquoisePogs
Copy link
Member

There's a lot of unfinished work here that you may be able to continue and improve upon to save time:
zowe/zowe-common-c#68
zowe/zowe-common-c#69
zowe/zowe-common-c#109
zowe/zss#66
zowe/zss#67
zowe/zss#144

@jordanfilteau1995
Copy link
Contributor

Hi,

I recommend we use, https://www.ibm.com/support/knowledgecenter/SSLTBW_2.1.0/com.ibm.zos.v2r1.bpxb600/rda.htm, to handle data set creation. There's a lot of work done already, so I suppose we could use DYNALLOC. I'm just not sure why you'd want to when BPXWDYN is available.

@John-A-Davies
Copy link

I'm trawling through the 12 or so repos (I note Sean's comment that I should just focus in the repos "zss" and "zowe-common-c" for all the mainframe tasks), and I note Jordan's comment about using BPXWDYN (looks like it calls DYNALLOC) rather that DYNALLOC directly. BTW I notice that invocations of many SVC's, including SVC 99 (DYNALLOC) are already present in zowe-common-c. I don't plan to use them for this task.

The parms you can specify for BPXWDYN look fine to me, and I think we want to access datasets via USS, rather than use a TSO session where we can issue ALLOCATE, or some environment where we can issue SVC 99.

I've also seen fopen used on what appears to be the format "//'data.set.name'"

@John-A-Davies
Copy link

Work plan

  1. Enq/deq dataset or member
  2. Write RECFM=V dataset or member
  3. Un-disable Editor File  SAVE button
  4. Add ENQ/DEQ call before/after file EDIT
  5. Implement timeout policy
  6. Implement ENQ revoke policy
  7. Obtain dataset DCB before contents
  8. Implement RECFM=F support
  9. Add dataset/member create feature
  10. Expose VSAM dataset editing
  11. Expose VSAM dataset creation

Work progress
Numbers 1 and 2 above have been demonstrated manually, as proof that the code to do that is present.

@John-A-Davies
Copy link

John-A-Davies commented Sep 9, 2020

Zowe REST API to access z/OS datasets.

I will enhance this API (see zlux-app-server/doc/swagger/fileapi.yaml ) datasetContents.

  1. to take a user-level lock on the dataset before it's read with GET

  2. to enable update of the dataset with PUT

  3. to enable dataset creation with POST

For GET, the lock will be an exclusive ENQ on the dataset or member. It will be specified in a custom HTTP header, not as a query parameter. The header will be optional, so the default will be "no lock", as at present.

For PUT, the lock will be released by DEQ. The default will be DEQ=TRUE, but you can specify DEQ=FALSE in the HTTP header if you want to retain the lock. (correction; I think this should be POST, because the side-effect of DEQ makes it not idempotent. So PUT if the user wants "SAVE and continue", and POST if they want "SAVE and EXIT".

@1000TurquoisePogs
Copy link
Member

I believe there is currently a DELETE, for removing a dataset. Do you think a lock is needed there as well?

@John-A-Davies
Copy link

No, it's an atomic action for a dataset, including a PDS. However, if you want to delete a MEMBER, then you probably want to silently ENQ the PDS because you will effectively be updating the directory. Curiously, z/OSMF does not externalise an offer to ENQ when deleting a member of a PDS, but maybe it's a given that exclusive ENQ on the PDS is required for that, just for the (short) duration of the DELETE.

If the dataset or member is in use, DELETE will fail. Taking an exclusive ENQ first will fail too, so all you gain is a clearer reason why the request failed.

@NakulManchanda
Copy link
Member Author

NakulManchanda commented Sep 23, 2020

@John-A-Davies @1000TurquoisePogs @ifakhrutdinov @OnnoVdT
Summary of meeting to discuss john's research on pessimistic locking

Design Decisions

  • what lock to acquire - broadest possible seems like - SPFEDIT EXC
  • implementation will have mechanism to provide header for save(keep holding lock) & save and quit (save and release lock)

Need answers to questions

  • for a short lived http request, we need a mechanism on zss cross memory server a way to maintain n manage locks which goes beyond the scope of request.
  • Avoiding Deadlocks - way to expire, settimeout on zss server - for ungraceful exits - like client closing the browser -
  • need an API or operator command to tell sysadmin who owns a lock, as zss is shared user space, system knows that zss holds a lock, and not about specific user, sys admins need to query zss
  • in impersonation mode we also need a way to determine if user has ability to alloc dataset or not

@John-A-Davies
Copy link

John-A-Davies commented Sep 23, 2020 via email

@NakulManchanda
Copy link
Member Author

NakulManchanda commented Sep 23, 2020

@John-A-Davies @1000TurquoisePogs @ifakhrutdinov @OnnoVdT

Alternate proposal to go towards Optimistic Locking route

Comparison between optimistic & pessimistic locking:
https://stackoverflow.com/questions/129329/optimistic-vs-pessimistic-locking

Optimistic locking approach using e-tags:
https://sookocheff.com/post/api/optimistic-locking-in-a-rest-api/

Flow of request:

  • GET - Need to attach/compute & attach
    • last modified information or checksum information
      either to every get request or if we differentiate, request which has intention of modifying file
  • Save/POST
    • need to query version info,
    • if match then acquire lock, complete save and release lock
    • if not matched, then abort

Trick is: how to determine file on server is newer

  • Use file checksum, and attach it as e-tag header in http request
    Cons:

    • hashing is order of size of file
    • it can be resource intensive
  • Potential alternate could be last-modified system timestamp if it can be used as safe alternative - needs more research

    • ask system for that info if available
      As per John.
    • No last modified information at dataset level same for PDS and seq.
    • It might be discoverable from the DSCB if you peeked

Advantage over pessimistic:

  • no need to manage lock for long duration in zss server

Need answers to questions

  • which strategy to use checksum or last-modified for dataset versioning?
  • if file on server is newer, need a sophisticated way to inform user or handle merges in sophisticated way
    - High sophistication - automatic conflict resolution
    - Medium sophistication - prompt user for every conflict resolution
    - Low sophistication - just cancel the request
  • Each small save can be expensive as we need to verify file version, acquire & release lock on each save.
    User can hit save multiple times during file edit session - 10s or 100s of times
  • same problem as other approach above - not sure for short lived request we still need a impersonation mode
    • in impersonation mode we also need a way to determine if user has ability to alloc dataset or not

@John-A-Davies
Copy link

John-A-Davies commented Sep 23, 2020

Joe Winchester 4:59 PM
Pessimistic is required for z/OS data sets
I know from z/OS Explorer that without it system programmers won’t roll it out
for some PDS members like PPT entries or PARMLIB they are too critical to not be pessimistic
optimistic is good for a data set member that is part of an SCM like github or endeavor or something that provides deep history
ENQUE is needed and should be done first. Optimistic can come later

@John-A-Davies
Copy link

DAVID MCKNIGHT 6:01 PM
RSE as a generic framework has been around for about 20 years. IDz/zExplorer's use of RSE has been around for about 16 years and the MVS support it brought in used pessimistic locking from the start in order to be aligned with ISPF. The FTP work including in z/OS Explorer was an independent effort (by the CICS explorer team) to provide MVS access via FTP. I don't know for sure but I don't think FTP on MVS supports locking.
Hence the optimistic approach.
Now there's also RSE API, which is a separate server from legacy RSE. Currently RSE API is used without locking support (i.e. optimistic) but supports Etags (like z/OS MF).

@Joe-Winchester
Copy link
Member

Pessimistic lock is required for rollout to z/OS customers. If the file is a PPT or PARMLIB my experience from working with other file editing tools is that they insist on pessimistic locking being in the file editor.
Optimistic locking may work for data set members if there is a source code manager and/or for developer source code files.

I'd like to see File Editor implement pessimistic locking as the MVP for moving forward from beta mode which is story number 1). It'd be good also to have some kind of visual to show the file is locked, perhaps a padlock.

Optimistic locking is a separate story, and perhaps there could be a way to show with an unlocked padlock that the file isn't locked and allow a way to place the lock on if the user decides they want to lock it.

Ideally when both are done it could be the system programmer who configures Zowe who makes the central choice which is the default, and also perhaps for which data set matching patterns so that some data set patterns are pessimistic and some are optimistic.

@NakulManchanda
Copy link
Member Author

NakulManchanda commented Sep 25, 2020

Hi @1000TurquoisePogs @John-A-Davies @ifakhrutdinov

  • Based on discussion above we believe pessimistic is a way to go.
  • for a short lived http request, we need a mechanism on zss cross memory server a way to maintain n manage locks which goes beyond the scope of request.
  • Avoiding Deadlocks - way to expire, settimeout on zss server - for ungraceful exits - like client closing the browser -
  • need an API or operator command to tell sysadmin who owns a lock, as zss is shared user space, system knows that zss holds a lock, and not about specific user, sys admins need to query zss
  • in impersonation mode we also need a way to determine if user has ability to alloc dataset or not

I think we need separate lock server process, to manage locks and transaction.

  • Sean, briefly mentioned to fork a child process which manage object in memory/or on disk to maintain locks, this object can live longer than context of http request
    -We should maintain a map or table in memory with following attributes
    - id, lock_id, lock_type, is_exclusive, dataset name, request_timestamp, requested_by_user, expire_timestamp, is_request_accepted_or_rejected, reason_code_rejection, reason_message_rejection, is_released
    additional optional attributes for future
    • last_modified, checksum - these two can be used in shared lock scenario before saving
  • all our transactions/scenario can revolve around this lock table data structure
  • request_timestamp is required to resolve deadlocks in ungraceful exit scenarios, and automatically expire lease of locks after certain timeout.
  • this lock server can be queried by operator command to report on which particular user is holding a lock
  • all permissions and checks related to user authorization of dynamic alloc etc is responsibility of server as well

Separate process hopefully will simplify zss http server code as well, which can convert lock server responses to appropriate http status codes and respond to user.

Hope, this makes sense. Need a discussion that above suggestion can fly. Thanks!

@OnnoVdT
Copy link

OnnoVdT commented Sep 25, 2020

Nakul,
Note that the system (GRS) already stores some of this info (isExclusive, DSN, ASID, TCB) and you can get it with a system call. I know there is also a way to keep this in sync across a sysplex, but I'm not sure if the system name is also stored in GRS in that situation.

@John-A-Davies
Copy link

John-A-Davies commented Sep 25, 2020

Bear in mind that some of the actions in this area require APF authorisiation. We want to avoid the requirement for the zssServer to be APF authorised.

For example, if you want to acquire a SYSDSN lock like ISPF EDIT does on the dataset, you need to be authorised, at least for EXC access. I have not tried SYSDSN with SHR.

For z/OS Explorer, RSE threadpool and daemon are APF-authorized... you can find more about that here:
https://www.ibm.com/support/pages/sites/default/files/inline-files/$FILE/zexp_host_ref_1.pdf

@John-A-Davies
Copy link

@struga0258 FYI as discussed today.

Test Dataset Edit.docx

@1000TurquoisePogs 1000TurquoisePogs added app editor enhancement New feature or request zss This issue has some dependency on zss labels Oct 1, 2020
@1000TurquoisePogs
Copy link
Member

On the topic of dataset creation: datasetMetadata code that we have already can determine properties of existing datasets, therefore the creation API can support the "allocate like" feature by using a dataset name as input, and doing the lookup of properties internal to zss.

@John-A-Davies
Copy link

After discussion with @1000TurquoisePogs just now, I'm going to try the following experiment:

Create a separate URI for dataset ENQ. This will be handled with

  • doImpersonation=TRUE
  • runInSubTask = FALSE

Which will run in the zss server's core, so the ENQ will persist.

The zss core will thus have to manage a list of ENQs. Code to operate this as a hash table already exists in unixFileService.c.

@ifakhrutdinov
Copy link

@John-A-Davies you cannot impersonate without running in a subtask. It'd be bad to change the impersonation of the main thread.

@ifakhrutdinov
Copy link

@John-A-Davies

  • Can something more generic be use instead of /datasetEnqueue and /datasetDequeue? For example, /openDataset and /closeDataset. The reason is, word enqueue would leak implementation details.
  • How will the operator know which user is holding a dataset someone else is trying to access?

@OnnoVdT I don't think POST/WAIT would be needed, since you can just use POSIX synchronization mechanisms.

@John-A-Davies
Copy link

@OnnoVdT I'm using semctl(), semop(), semget(), all inside the zssServer code.

@John-A-Davies
Copy link

@ifakhrutdinov

  • I don't mind too much about the names; for me, lock and enqueue are very similar. But I do think we should keep the enqueue and the /datasetContent separate, for the avoidance of doubt. /openDataset might imply that there is a choice of open mode; read/write/exclusive and so on. Perhaps have /datasetLock and /datasetUnlock?

  • Knowing who is holding a dataset, so as to tell the operator or another user, is a feature I'd like to defer until we get the basic ENQ, GET, PUT, DEQ cycle working.

@OnnoVdT
Copy link

OnnoVdT commented Nov 12, 2020

This clearly shows my home-turf is different than yours, :) I didn't even consider POSIX C would have similar constructs.

@John-A-Davies
Copy link

@OnnoVdT Would be nice if it supported POSIX, but I think the z/OS implementation is called SYSTEM V. It has different primitives from POSIX.

@John-A-Davies
Copy link

We discussed the API and which HTTP methods to use for each URI. My suggestions were:

   HTTP method | URI              | Dataset       | action
---------------+------------------+---------------+-------
1  GET         | /datasetEnqueue  | DATA.SET.NAME | ENQ
2  GET         | /datasetContents | DATA.SET.NAME | GET
3  PUT         | /datasetContents | DATA.SET.NAME | PUT
4  DELETE      | /datasetEnqueue  | DATA.SET.NAME | DEQ
5  POST        | /datasetDequeue  | DATA.SET.NAME | DEQ

We thought that there are no absolute answers. We were OK with line 4 instead of line 5. Comments welcome!

@Joe-Winchester
Copy link
Member

From an outside in design two interactions for opening a file

  • User A wants to open a file. They can either ask for a lock to be placed so the contents cannot be modified by another user, or else they want to open read only without a lock
  • User B wants to open a file. If they ask for a lock then if it can't be given they need to be told why

The API for this is best as one call, to avoid race conditions

  • GET /datasetContents DATA.SET.NAME - header ENQUEUE=YES
    This should place an ENQUEUE and return contents. If the header ENDQUEUE=YES is not set then return contents without placing a lock. If the lock cannot be placed return the 423 code together with a response body that indicates the user who has the lock. (see https://www.restapitutorial.com/httpstatuscodes.html). The requester then should do a GET without the ENQUEUE=YES header to have read only contents.

When user A is finished editing the file they should do a PUT with the contents, and with DEQUEUE=YES. This would release the lock. If user A decides to discard and not update the contents then a POST.

Following these through the APIs needed are 2, 3 and 5 only - with the difference between that the header data to put the lock flows on 3.

@ifakhrutdinov
Copy link

In my opinion, term enqueue may not be something newcomers or even normal z/OS users know, this is usually used by systems programmers. We should use terminology familiar to normal users. Maybe DISP or mode={read|write}?

@ifakhrutdinov
Copy link

@John-A-Davies , how do other similar servers implement this on other platforms (e.g. for accessing files)? Perhaps there is a standard approach we shouldn't reinvent.

@John-A-Davies John-A-Davies added this to the 20PI4S6 milestone Nov 24, 2020
@John-A-Davies
Copy link

John-A-Davies commented Nov 24, 2020

API proposal for the code to be delivered in 2020 Q4:

   HTTP method | URI              | Dataset       | action
---------------+------------------+---------------+-------
1  GET         | /datasetEnqueue  | DATA.SET.NAME | ENQ
2  GET         | /datasetContents | DATA.SET.NAME | GET
3  POST        | /datasetContents | DATA.SET.NAME | PUT
4  DELETE      | /datasetEnqueue  | DATA.SET.NAME | DEQ

@John-A-Davies
Copy link

@struga0258 FYI the API calls above are the ones the Editor uses for this feature. They should be tested.
I don't have a list of HTTP response codes that they issue yet but I will create one.

The new API is not to be published as yet.

@John-A-Davies
Copy link

John-A-Davies commented Dec 1, 2020

The set of current PRs for this issue are here:

Pull requests

Zlux:
zowe/zlux-file-explorer#105
zowe/zlux-platform#66
zowe/zlux-app-manager#291
zowe/zlux-app-server#140
zowe/zlux-editor#176
zowe/zlux-server-framework#253
zowe/zss-auth#30

zss:
zowe/zss#223
zowe/zowe-common-c#184

@John-A-Davies
Copy link

John-A-Davies commented Dec 4, 2020

Implementation questions:

  1. Can a different user DEQ a dataset that's currently ENQ by the first user?
    If it’s held just by the file name and not including the user, can user A open an editor and lock the file - and if user B comes along how do you know they aren’t the same as A and treat them differently - so that user A can edit and save, whereas user B can’t

  2. What happens to un-released semaphores held by zss when the zss server terminates?

@OnnoVdT
Copy link

OnnoVdT commented Dec 4, 2020

for Q1, Can a different user DEQ a dataset that's currently ENQ by the first user?
I would keep it simple, no. If you lock a dataset for editing, it is tied to that session (ISPF does the same in a sysplex). The only one who should be able to release that lock is the user who did the lock, or a system operator with a console command. You don't want random folks removing your lock because they feel like it.

@OnnoVdT
Copy link

OnnoVdT commented Dec 4, 2020

For Q2, semaphores held by zss when the zss server terminates. My understanding is ZSS does the actual enqueue, and thus according to the system, ZSS is the owner. When the ZSS address space terminates, the system will release all locks held by that address space as part of normal RTM cleanup. Your tracking file will still know about the locks so you can re-establish them once ZSS comes up, or have a hot stand-by take over, but there will be a window where the files are not locked while the owner thinks they are.

@John-A-Davies
Copy link

John-A-Davies commented Dec 4, 2020

@OnnoVdT Just to be clear, the real ISGENQ dataset enqueue and the semaphore are both held in the process that's created to impersonate the user, so the ENQ is held in the name of the user who issued the GET datasetEnqueue. The ENQ is gracefully released when the user posts the semaphore, or ungracefully when the process dies.

The dataset-to-semaphore relationship is held in memory of the zss process.

@John-A-Davies
Copy link

John-A-Davies commented Dec 11, 2020

Baseline statement of known issues with code in the zlux and zss repos, in branches named dsedit#498 today.

  1. Things that are incorrect
  1. Updates to RECFM=F datasets will fail if the record gets too long; select the whole file to see the hidden spaces at EOL, remove them and re-do the SAVE.
    Updates to RECFM=F datasets fail if the record gets too long #576
  1. Invalid record for dataset, recordLength=130 but max for dataset is 80 (is this because \r is not counted as newline, so the whole file appears as a single line?)
    Invalid record for dataset because EOL is \r  #577
  1. Updates to (RECFM=F) files seem to save only the last record, which is repeated (=issue 232; padding ).
  1. The editor incorrectly wraps the file body in an extra layer of JSON if the dataset SAVE fails.
    Editor incorrectly wraps the file body in an extra layer of JSON if the dataset SAVE fails. #578
  1. You can’t update members when ISPF is viewing the member list (solution available from @John-A-Davies ) File Editor can not update dataset members when ISPF is viewing the member list #573
  1. API URL does not work when DSN contains the hash sign (#) --- API URL does not work when DSN contains # zss#234
  1. Features that need to be added
  1. closing editor - release all locks on closing editor
    Release remaining dataset locks when file editor stops #570
  1. timeout / heartbeat - release lock on closing browser window or idle for a specified timeout time
    Release remaining dataset locks when file editor stops #570
  1. Pop-up messages don’t contain error info e.g. which user holds the lock for a dataset you want to edit.
    You need to make that username available: see Reveal who owns a lock #582

@NakulManchanda
Copy link
Member Author

NakulManchanda commented Dec 14, 2020

These were some of the design goals:

  • 1) For a short lived http request, we need a mechanism on zss cross memory server a way to maintain n manage locks which goes beyond the scope of request.
  • 2) Avoiding Deadlocks - way to expire, settimeout on zss server - for ungraceful exits - like client closing the browser
  • 3) Need an API or operator command to tell sysadmin who owns a lock, as zss is shared user space, system knows that zss holds
  • 4) A lock, and not about specific user, sys admins need to query zss
    in impersonation mode we also need a way to determine if user has ability to alloc dataset or not

1,4 - done
2,3 - left

2(avoid deadlock) is absolutely necessary, will be accomplished with these two items, as commented above by John.

  • closing editor - release all locks on closing editor
  • timeout / heartbeat - release lock on closing browser window or idle for a specified timeout time

FYI @John-A-Davies @Joe-Winchester @1000TurquoisePogs @ifakhrutdinov

@John-A-Davies
Copy link

John-A-Davies commented Jan 12, 2021

Discussion of item 2; avoiding deadlock.

The purpose of this item is to release locks the WEBUI user has taken on datasets in the cases when the user has not specifically released them.

Created #570 for this.

@1000TurquoisePogs
Copy link
Member

We ended up implementing this for now zowe/zss#355 and it seems to be working simply. So, lets close this epic as completed. If specifically pessmistic writing is needed in the future, we'll take a look at this work again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
app editor enhancement New feature or request zss This issue has some dependency on zss
Projects
None yet
Development

No branches or pull requests

7 participants